Lexicon and Attention based Handwritten Text Recognition System
- URL: http://arxiv.org/abs/2209.04817v1
- Date: Sun, 11 Sep 2022 09:26:45 GMT
- Title: Lexicon and Attention based Handwritten Text Recognition System
- Authors: Lalita Kumari, Sukhdeep Singh, VVS Rathore and Anuj Sharma
- Abstract summary: We take two state-of-the-art neural network systems and merge an attention mechanism into them.
We achieve a 4.15% character error rate and 9.72% word error rate on the IAM dataset, and a 7.07% character error rate and 16.14% word error rate on the GW dataset.
- Score: 3.9097549127191473
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Handwritten text recognition, a sub-domain of pattern recognition, is
widely studied by the computer vision community because of its scope for improvement
and its applicability to daily life. With the growth in computational power over the
last few decades, neural-network-based systems have contributed heavily to
state-of-the-art handwritten text recognizers. In the same direction, we take two
state-of-the-art neural network systems and merge an attention mechanism into them.
Attention has been used widely in neural machine translation and automatic speech
recognition, and is now being adopted in the text recognition domain. In this study,
after merging attention and a word beam search decoder with the existing Flor et al.
architecture, we achieve a 4.15% character error rate and 9.72% word error rate on
the IAM dataset, and a 7.07% character error rate and 16.14% word error rate on the
GW dataset. For further analysis, we also use a system similar to the Shi et al.
neural network with a greedy decoder and observe a 23.27% improvement in character
error rate over the base model.
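As a rough illustration of how an attention layer can be merged into a CRNN-style
recognizer, one option is to insert a self-attention block between the convolutional
feature extractor and the recurrent layers that produce the per-frame character
probabilities. The sketch below is a simplification: the layer sizes, module names,
and attention placement are assumptions and do not reproduce the exact Flor et al.
or Shi et al. architectures.

```python
import torch
import torch.nn as nn

class AttentionCRNN(nn.Module):
    """Illustrative CRNN with a self-attention block before the recurrent layers.

    A sketch only: the convolutional stack, hidden sizes, and attention placement
    are assumptions, not the exact Flor et al. or Shi et al. models.
    """
    def __init__(self, num_classes, feat_dim=256, hidden=256, heads=4):
        super().__init__()
        # Small convolutional feature extractor: (B, 1, H, W) -> (B, feat_dim, 1, W')
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(128, feat_dim, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, None)),   # collapse height into one row of columns
        )
        # Self-attention over the column sequence (the "merged" attention mechanism)
        self.attn = nn.MultiheadAttention(feat_dim, heads, batch_first=True)
        self.rnn = nn.LSTM(feat_dim, hidden, num_layers=2,
                           bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * hidden, num_classes)  # classes include the CTC blank

    def forward(self, images):                 # images: (B, 1, H, W)
        f = self.cnn(images)                   # (B, feat_dim, 1, W')
        seq = f.squeeze(2).permute(0, 2, 1)    # (B, W', feat_dim): one vector per column
        ctx, _ = self.attn(seq, seq, seq)      # attention-weighted context
        out, _ = self.rnn(seq + ctx)           # residual merge, then BiLSTM
        return self.head(out).log_softmax(-1)  # per-frame log-probabilities for CTC
```

At test time the per-frame probabilities can be decoded greedily (take the arg-max
character at each frame, collapse repeats, remove blanks) or with a lexicon-constrained
word beam search decoder, as in the study above.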
Related papers
- GatedLexiconNet: A Comprehensive End-to-End Handwritten Paragraph Text Recognition System [3.9527064697847005]
We present an end-to-end paragraph recognition system that incorporates internal line segmentation and an encoder based on convolutional layers.
This study reported character error rates of 2.27% on IAM, 0.9% on RIMES, and 2.13% on READ-2016, and a word error rate of 5.73% on the READ-2016 dataset.
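The character and word error rates quoted throughout these papers are normalized
Levenshtein (edit) distances between the recognized text and the ground truth. A
minimal sketch of how they are typically computed (function names are illustrative):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (characters or words)."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (r != h))  # substitution (or match)
    return dp[-1]

def cer(ref, hyp):
    """Character error rate: character-level edit distance / reference length."""
    return edit_distance(list(ref), list(hyp)) / max(len(ref), 1)

def wer(ref, hyp):
    """Word error rate: word-level edit distance / number of reference words."""
    r, h = ref.split(), hyp.split()
    return edit_distance(r, h) / max(len(r), 1)
```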
arXiv Detail & Related papers (2024-04-22T10:19:16Z) - Cracking the neural code for word recognition in convolutional neural networks [1.0991358618541507]
We show how a small subset of units becomes specialized for word recognition in the learned script.
We show that these units are sensitive to specific letter identities and their distance from the blank space at the left or right of a word.
The proposed neural code provides mechanistic insight into how information on letter identity and position is extracted and allows for invariant word recognition.
arXiv Detail & Related papers (2024-03-10T10:12:32Z) - Offline Detection of Misspelled Handwritten Words by Convolving
Recognition Model Features with Text Labels [0.0]
We introduce the task of comparing a handwriting image to text.
Our model's classification head is trained entirely on synthetic data created using a state-of-the-art generative adversarial network.
Such massive performance gains can lead to significant productivity increases in applications utilizing human-in-the-loop automation.
arXiv Detail & Related papers (2023-09-18T21:13:42Z) - Agile gesture recognition for capacitive sensing devices: adapting
on-the-job [55.40855017016652]
We demonstrate a hand gesture recognition system that uses signals from capacitive sensors embedded into the etee hand controller.
The controller generates real-time signals from each of the wearer's five fingers.
We use a machine learning technique to analyse the time-series signals and identify three features that can represent the five fingers within 500 ms.
arXiv Detail & Related papers (2023-05-12T17:24:02Z) - A Lexicon and Depth-wise Separable Convolution Based Handwritten Text
Recognition System [3.9097549127191473]
We have used depthwise convolution in place of standard convolutions to reduce the total number of parameters to be trained.
We have also included a lexicon-based word beam search decoder at the testing step.
We have obtained a 3.84% character error rate and 9.40% word error rate on the IAM dataset, and a 4.88% character error rate and 14.56% word error rate on the George Washington dataset.
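A depthwise separable convolution factorizes a standard convolution into a per-channel
(depthwise) filter followed by a 1x1 pointwise mixing convolution, cutting the weight
count from roughly k*k*C_in*C_out to k*k*C_in + C_in*C_out. A minimal sketch with
illustrative channel sizes (not the paper's exact configuration):

```python
import torch.nn as nn

# Standard 3x3 convolution: 3*3*64*128 = 73,728 weights (ignoring bias)
standard = nn.Conv2d(64, 128, kernel_size=3, padding=1)

# Depthwise separable replacement with the same receptive field and output channels:
#   depthwise 3x3 (groups = in_channels): 3*3*64 =   576 weights
#   pointwise 1x1:                        64*128 = 8,192 weights
separable = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1, groups=64),  # depthwise
    nn.Conv2d(64, 128, kernel_size=1),                        # pointwise
)
```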
arXiv Detail & Related papers (2022-07-11T06:24:26Z) - SAFL: A Self-Attention Scene Text Recognizer with Focal Loss [4.462730814123762]
Scene text recognition remains challenging due to inherent problems such as distortions or irregular layout.
Most existing approaches leverage recurrence- or convolution-based neural networks.
We introduce SAFL, a self-attention-based neural network model with the focal loss for scene text recognition.
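Focal loss down-weights examples the model already classifies confidently, so training
focuses on hard ones: FL(p_t) = -(1 - p_t)^gamma * log(p_t), where p_t is the predicted
probability of the true class. A minimal sketch (not SAFL's exact formulation; gamma = 2
is a commonly used default):

```python
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """Multi-class focal loss over raw logits (illustrative sketch)."""
    log_p = F.log_softmax(logits, dim=-1)                         # (N, C)
    log_pt = log_p.gather(-1, targets.unsqueeze(-1)).squeeze(-1)  # log-prob of true class
    pt = log_pt.exp()
    return (-(1.0 - pt) ** gamma * log_pt).mean()
```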
arXiv Detail & Related papers (2022-01-01T06:51:03Z) - Open Vocabulary Electroencephalography-To-Text Decoding and Zero-shot
Sentiment Classification [78.120927891455]
State-of-the-art brain-to-text systems have achieved great success in decoding language directly from brain signals using neural networks.
In this paper, we extend the problem to open-vocabulary Electroencephalography (EEG)-to-Text sequence-to-sequence decoding and zero-shot sentence sentiment classification on natural reading tasks.
Our model achieves a 40.1% BLEU-1 score on EEG-To-Text decoding and a 55.6% F1 score on zero-shot EEG-based ternary sentiment classification, which significantly outperforms supervised baselines.
arXiv Detail & Related papers (2021-12-05T21:57:22Z) - Lexically Aware Semi-Supervised Learning for OCR Post-Correction [90.54336622024299]
Much of the existing linguistic data in many languages of the world is locked away in non-digitized books and documents.
Previous work has demonstrated the utility of neural post-correction methods on recognition of less-well-resourced languages.
We present a semi-supervised learning method that makes it possible to utilize raw images to improve performance.
arXiv Detail & Related papers (2021-11-04T04:39:02Z) - Instant One-Shot Word-Learning for Context-Specific Neural
Sequence-to-Sequence Speech Recognition [62.997667081978825]
We present an end-to-end ASR system with a word/phrase memory and a mechanism to access this memory to recognize the words and phrases correctly.
In this paper we demonstrate that through this mechanism our system is able to recognize more than 85% of newly added words that it previously failed to recognize.
arXiv Detail & Related papers (2021-07-05T21:08:34Z) - SmartPatch: Improving Handwritten Word Imitation with Patch
Discriminators [67.54204685189255]
We propose SmartPatch, a new technique increasing the performance of current state-of-the-art methods.
We combine the well-known patch loss with information gathered from the parallel trained handwritten text recognition system.
This leads to a more enhanced local discriminator and results in more realistic and higher-quality generated handwritten words.
arXiv Detail & Related papers (2021-05-21T18:34:21Z) - Mechanisms for Handling Nested Dependencies in Neural-Network Language
Models and Humans [75.15855405318855]
We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing.
Although the network was solely trained to predict the next word in a large corpus, analysis showed the emergence of specialized units that successfully handled local and long-distance syntactic agreement.
We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns.
arXiv Detail & Related papers (2020-06-19T12:00:05Z)