Recurrence-free unconstrained handwritten text recognition using gated
fully convolutional network
- URL: http://arxiv.org/abs/2012.04961v1
- Date: Wed, 9 Dec 2020 10:30:13 GMT
- Title: Recurrence-free unconstrained handwritten text recognition using gated
fully convolutional network
- Authors: Denis Coquenet, Clément Chatelain, Thierry Paquet
- Abstract summary: Unconstrained handwritten text recognition is a major step in most document analysis tasks.
One alternative to using LSTM cells is to compensate for the loss of long-term memory with a heavy use of convolutional layers.
We present a Gated Fully Convolutional Network architecture that is a recurrence-free alternative to the well-known CNN+LSTM architectures.
- Score: 2.277447144331876
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unconstrained handwritten text recognition is a major step in most document
analysis tasks. This is generally processed by deep recurrent neural networks
and more specifically with the use of Long Short-Term Memory cells. The main
drawbacks of these components are the large number of parameters involved and
their sequential execution during training and prediction. One alternative
solution to using LSTM cells is to compensate for the loss of long-term memory
with a heavy use of convolutional layers, whose operations can be executed in
parallel and which involve fewer parameters. In this paper we present a Gated Fully
Convolutional Network architecture that is a recurrence-free alternative to the
well-known CNN+LSTM architectures. Our model is trained with the CTC loss and
shows competitive results on both the RIMES and IAM datasets. We release all
code to enable reproduction of our experiments:
https://github.com/FactoDeepLearning/LinePytorchOCR.
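As a rough illustration of the idea, the sketch below combines a GLU-style gated convolution (output = conv(x) * sigmoid(gate(x))) with a per-column character classifier and the CTC loss in PyTorch. The block structure, layer counts, feature sizes, and the vertical-collapse step are illustrative assumptions, not the authors' exact architecture; the released code at the repository above is the reference implementation.

```python
# Minimal, hedged sketch of a gated fully convolutional line recognizer trained
# with CTC. Layer counts and sizes are illustrative guesses, not the paper's
# exact architecture (see https://github.com/FactoDeepLearning/LinePytorchOCR).
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedConv2d(nn.Module):
    """Convolution with a multiplicative gate (GLU-style): y = conv(x) * sigmoid(gate(x))."""

    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1):
        super().__init__()
        padding = kernel_size // 2
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding)
        self.gate = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding)

    def forward(self, x):
        return self.conv(x) * torch.sigmoid(self.gate(x))


class TinyGFCN(nn.Module):
    """Recurrence-free encoder: gated conv blocks, then a per-column character classifier."""

    def __init__(self, num_classes):  # num_classes includes the CTC blank symbol
        super().__init__()
        self.encoder = nn.Sequential(
            GatedConv2d(1, 32), nn.MaxPool2d((2, 2)),
            GatedConv2d(32, 64), nn.MaxPool2d((2, 2)),
            GatedConv2d(64, 128), nn.MaxPool2d((2, 1)),  # keep horizontal resolution
        )
        self.classifier = nn.Conv2d(128, num_classes, kernel_size=1)

    def forward(self, x):                # x: (batch, 1, height, width)
        feats = self.encoder(x)
        logits = self.classifier(feats)  # (batch, classes, h', w')
        logits = logits.mean(dim=2)      # collapse the vertical axis -> (batch, classes, w')
        return logits.permute(2, 0, 1)   # CTC expects (time, batch, classes)


# Toy training step with the CTC loss (blank index 0), purely illustrative.
model = TinyGFCN(num_classes=80)
ctc = nn.CTCLoss(blank=0, zero_infinity=True)
images = torch.randn(4, 1, 64, 256)         # fake line images
targets = torch.randint(1, 80, (4, 20))     # fake character-label sequences
log_probs = F.log_softmax(model(images), dim=2)
input_lengths = torch.full((4,), log_probs.size(0), dtype=torch.long)
target_lengths = torch.full((4,), 20, dtype=torch.long)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()
print(float(loss))
```

Averaging over the vertical axis before the CTC loss is just one simple way to turn 2D feature maps into a horizontal character sequence; the actual collapse strategy used in the paper may differ.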
Related papers
- Scaling up ridge regression for brain encoding in a massive individual fMRI dataset [1.740992908651449]
This paper evaluates different parallelization techniques to reduce the training time of brain encoding with ridge regression.
With multi-threading, our results show that the Intel Math Kernel Library (MKL) significantly outperforms the OpenBLAS library.
We propose a new "batch" version of Dask parallelization, motivated by a time complexity analysis.
arXiv Detail & Related papers (2024-03-28T13:52:12Z)
- NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction [79.13750275141139]
This paper proposes a novel and fast self-supervised solution for sparse-view CBCT reconstruction.
The desired attenuation coefficients are represented as a continuous function of 3D spatial coordinates, parameterized by a fully-connected deep neural network.
A learning-based encoder entailing hash coding is adopted to help the network capture high-frequency details.
arXiv Detail & Related papers (2022-09-29T04:06:00Z)
- Image Classification using Sequence of Pixels [3.04585143845864]
This study compares sequential image classification methods based on recurrent neural networks.
We describe methods based on Long Short-Term Memory (LSTM) and bidirectional Long Short-Term Memory (BiLSTM) architectures, among others.
arXiv Detail & Related papers (2022-09-23T09:42:44Z)
- GLEAM: Greedy Learning for Large-Scale Accelerated MRI Reconstruction [50.248694764703714]
Unrolled neural networks have recently achieved state-of-the-art accelerated MRI reconstruction.
These networks unroll iterative optimization algorithms by alternating between physics-based consistency and neural-network based regularization.
We propose Greedy LEarning for Accelerated MRI reconstruction, an efficient training strategy for high-dimensional imaging settings.
arXiv Detail & Related papers (2022-07-18T06:01:29Z)
- Working Memory Connections for LSTM [51.742526187978726]
We show that Working Memory Connections constantly improve the performance of LSTMs on a variety of tasks.
Numerical results suggest that the cell state contains useful information that is worth including in the gate structure; a minimal gate-level sketch of this idea appears after this list.
arXiv Detail & Related papers (2021-08-31T18:01:30Z)
- Recursively Refined R-CNN: Instance Segmentation with Self-RoI Rebalancing [2.4634850020708616]
We propose Recursively Refined R-CNN (R^3-CNN), which avoids duplicates by introducing a loop mechanism instead.
Our experiments highlight the specific encoding of the loop mechanism in the weights, requiring its usage at inference time.
The architecture is able to surpass the recently proposed HTC model, while reducing the number of parameters significantly.
arXiv Detail & Related papers (2021-04-03T07:25:33Z)
- Have convolutions already made recurrence obsolete for unconstrained handwritten text recognition? [3.0969191504482247]
Unconstrained handwritten text recognition remains an important challenge for deep neural networks.
Recurrent networks and Long Short-Term Memory networks have achieved state-of-the-art performance in this field.
We propose an experimental study regarding different architectures on an offline handwriting recognition task using the RIMES dataset.
arXiv Detail & Related papers (2020-12-09T10:15:24Z)
- EASTER: Efficient and Scalable Text Recognizer [0.0]
We present an Efficient And Scalable TExt Recognizer (EASTER) to perform optical character recognition on both machine printed and handwritten text.
Our model utilises 1-D convolutional layers without any recurrence, which enables parallel training with a considerably smaller volume of data.
We also showcase improvements over the current best results on offline handwritten text recognition task.
arXiv Detail & Related papers (2020-08-18T10:26:03Z)
- Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace the conventional ReLU with Bounded ReLU and find that the performance decline is due to activation quantization.
Our integer networks achieve equivalent performance as the corresponding FPN networks, but have only 1/4 memory cost and run 2x faster on modern GPU.
arXiv Detail & Related papers (2020-06-21T08:23:03Z)
- Recognizing Long Grammatical Sequences Using Recurrent Networks Augmented With An External Differentiable Stack [73.48927855855219]
Recurrent neural networks (RNNs) are a widely used deep architecture for sequence modeling, generation, and prediction.
RNNs generalize poorly over very long sequences, which limits their applicability to many important temporal processing and time series forecasting problems.
One way to address these shortcomings is to couple an RNN with an external, differentiable memory structure, such as a stack.
In this paper, we improve the memory-augmented RNN with important architectural and state updating mechanisms.
arXiv Detail & Related papers (2020-04-04T14:19:15Z)
- Depth-Adaptive Graph Recurrent Network for Text Classification [71.20237659479703]
Sentence-State LSTM (S-LSTM) is a powerful and highly efficient graph recurrent network.
We propose a depth-adaptive mechanism for the S-LSTM, which allows the model to learn how many computational steps to conduct for different words as required.
arXiv Detail & Related papers (2020-02-29T03:09:55Z)
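Following up on the Working Memory Connections entry above, the sketch below shows one way an LSTM's gates can also read the cell state, written here as peephole-style connections in PyTorch. It is a minimal illustration of the general idea, not the exact formulation used in that paper; the class and parameter names are made up for this example.

```python
# Hedged sketch: an LSTM cell whose gates also read the cell state, illustrating
# "including the cell state in the gate structure". Written as classic
# peephole-style gating; the Working Memory Connections paper's exact form may differ.
import torch
import torch.nn as nn


class CellStateGatedLSTMCell(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.x2g = nn.Linear(input_size, 4 * hidden_size)
        self.h2g = nn.Linear(hidden_size, 4 * hidden_size, bias=False)
        # Extra connections: the cell state also feeds the three gates.
        self.c2i = nn.Linear(hidden_size, hidden_size, bias=False)
        self.c2f = nn.Linear(hidden_size, hidden_size, bias=False)
        self.c2o = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, x, state):
        h, c = state
        gi, gf, gg, go = (self.x2g(x) + self.h2g(h)).chunk(4, dim=-1)
        i = torch.sigmoid(gi + self.c2i(c))      # input gate sees the cell state
        f = torch.sigmoid(gf + self.c2f(c))      # forget gate sees the cell state
        g = torch.tanh(gg)
        c_new = f * c + i * g
        o = torch.sigmoid(go + self.c2o(c_new))  # output gate sees the updated cell
        h_new = o * torch.tanh(c_new)
        return h_new, c_new


# Tiny usage example on random data.
cell = CellStateGatedLSTMCell(input_size=16, hidden_size=32)
h = torch.zeros(4, 32)
c = torch.zeros(4, 32)
for t in range(10):
    h, c = cell(torch.randn(4, 16), (h, c))
print(h.shape)  # torch.Size([4, 32])
```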
This list is automatically generated from the titles and abstracts of the papers on this site.