Continuous Offline Handwriting Recognition using Deep Learning Models
- URL: http://arxiv.org/abs/2112.13328v1
- Date: Sun, 26 Dec 2021 07:31:03 GMT
- Title: Continuous Offline Handwriting Recognition using Deep Learning Models
- Authors: Jorge Sueiras
- Abstract summary: Handwritten text recognition is an open problem of great interest in the area of automatic document image analysis.
We have proposed a new recognition model based on integrating two types of deep learning architectures: convolutional neural networks (CNN) and sequence-to-sequence (seq2seq) models.
The proposed model provides results competitive with those obtained with other well-established methodologies.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Handwritten text recognition is an open problem of great interest in the area
of automatic document image analysis. The transcription of handwritten content
present in digitized documents is significant in analyzing historical archives
or digitizing information from handwritten documents, forms, and
communications. In recent years, great advances have been made in this area
through the application of deep learning techniques. This thesis
addresses the offline continuous handwritten text recognition (HTR) problem,
consisting of developing algorithms and models capable of transcribing the text
present in an image without the need for the text to be segmented into
characters. For this purpose, we have proposed a new recognition model based on
integrating two types of deep learning architectures: convolutional neural
networks (CNN) and sequence-to-sequence (seq2seq) models. The
convolutional component of the model is designed to identify relevant features
present in characters, and the seq2seq component builds the transcription of
the text by modeling the sequential nature of the text. For the design of this
new model, an extensive analysis of the capabilities of different convolutional
architectures in the simplified problem of isolated character recognition has
been carried out in order to identify the most suitable ones to be integrated
into the continuous model. Additionally, extensive experimentation with the
proposed model on the continuous problem has been carried out to determine its
robustness to changes in parameterization. The generalization capacity of the
model has also been validated by evaluating it on three handwritten text
databases in different languages: IAM in English, RIMES in French, and
Osborne in Spanish. The proposed model provides results competitive
with those obtained with other well-established methodologies.
Related papers
- Towards Unified Multi-granularity Text Detection with Interactive Attention [56.79437272168507]
"Detect Any Text" is an advanced paradigm that unifies scene text detection, layout analysis, and document page detection into a cohesive, end-to-end model.
A pivotal innovation in DAT is the across-granularity interactive attention module, which significantly enhances the representation learning of text instances.
Tests demonstrate that DAT achieves state-of-the-art performances across a variety of text-related benchmarks.
arXiv Detail & Related papers (2024-05-30T07:25:23Z)
- Revisiting N-Gram Models: Their Impact in Modern Neural Networks for Handwritten Text Recognition [4.059708117119894]
This study addresses whether explicit language models, specifically n-gram models, still contribute to the performance of state-of-the-art deep learning architectures in the field of handwriting recognition.
We evaluate two prominent neural network architectures, PyLaia and DAN, with and without the integration of explicit n-gram language models.
The results show that incorporating character or subword n-gram models significantly improves the performance of ATR models on all datasets.
arXiv Detail & Related papers (2024-04-30T07:37:48Z)
- A Transformer-based Approach for Arabic Offline Handwritten Text Recognition [0.0]
We introduce two alternative architectures for recognizing offline Arabic handwritten text.
Our approach can model language dependencies and relies only on the attention mechanism, thereby making it more parallelizable and less complex.
Our evaluation on the Arabic KHATT dataset demonstrates that our proposed method outperforms the current state-of-the-art approaches.
arXiv Detail & Related papers (2023-07-27T17:51:52Z)
- Uncovering the Handwritten Text in the Margins: End-to-end Handwritten Text Detection and Recognition [0.840835093659811]
This work presents an end-to-end framework for automatic detection and recognition of handwritten marginalia.
It uses data augmentation and transfer learning to overcome training data scarcity.
The effectiveness of the proposed framework has been empirically evaluated on the data from early book collections found in the Uppsala University Library in Sweden.
arXiv Detail & Related papers (2023-03-10T14:00:53Z)
- SLCNN: Sentence-Level Convolutional Neural Network for Text Classification [0.0]
Convolutional neural networks (CNNs) have shown remarkable success in the task of text classification.
New baseline models have been studied for text classification using CNNs.
Results have shown that the proposed models perform better, particularly on longer documents.
arXiv Detail & Related papers (2023-01-27T13:16:02Z)
- Boosting Modern and Historical Handwritten Text Recognition with Deformable Convolutions [52.250269529057014]
Handwritten Text Recognition (HTR) in free-layout pages is a challenging image understanding task.
We propose to adopt deformable convolutions, which can deform depending on the input at hand and better adapt to the geometric variations of the text.
arXiv Detail & Related papers (2022-08-17T06:55:54Z)
- How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN [63.79300884115027]
Current language models can generate high-quality text.
Are they simply copying text they have seen before, or have they learned generalizable linguistic abstractions?
We introduce RAVEN, a suite of analyses for assessing the novelty of generated text.
arXiv Detail & Related papers (2021-11-18T04:07:09Z)
- Artificial Text Detection via Examining the Topology of Attention Maps [58.46367297712477]
We propose three novel types of interpretable topological features for artificial text detection, based on Topological Data Analysis (TDA).
We empirically show that the features derived from the BERT model outperform count- and neural-based baselines by up to 10% on three common datasets.
The probing analysis of the features reveals their sensitivity to the surface and syntactic properties.
arXiv Detail & Related papers (2021-09-10T12:13:45Z)
- One-shot Compositional Data Generation for Low Resource Handwritten Text Recognition [10.473427493876422]
Low-resource Handwritten Text Recognition is a hard problem because annotated data is scarce and linguistic information is very limited.
In this paper we address this problem through a data generation technique based on Bayesian Program Learning.
Contrary to traditional generation approaches, which require a huge amount of annotated images, our method is able to generate human-like handwriting using only one sample of each symbol from the desired alphabet.
arXiv Detail & Related papers (2021-05-11T18:53:01Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Neural Deepfake Detection with Factual Structure of Text [78.30080218908849]
We propose a graph-based model for deepfake detection of text.
Our approach represents the factual structure of a given document as an entity graph.
Our model can distinguish the factual structure of machine-generated text from that of human-written text.
arXiv Detail & Related papers (2020-10-15T02:35:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.