Uncovering the Handwritten Text in the Margins: End-to-end Handwritten
Text Detection and Recognition
- URL: http://arxiv.org/abs/2303.05929v2
- Date: Mon, 29 Jan 2024 19:23:39 GMT
- Authors: Liang Cheng, Jonas Frankemölle, Adam Axelsson and Ekta Vats
- Abstract summary: This work presents an end-to-end framework for automatic detection and recognition of handwritten marginalia.
It uses data augmentation and transfer learning to overcome training data scarcity.
The effectiveness of the proposed framework has been empirically evaluated on the data from early book collections found in the Uppsala University Library in Sweden.
- Score: 0.840835093659811
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The pressing need for digitization of historical documents has led to a
strong interest in designing computerised image processing methods for
automatic handwritten text recognition. However, not much attention has been
paid to studying the handwritten text written in the margins, i.e. marginalia,
that also forms an important source of information. Nevertheless, training an
accurate and robust recognition system for marginalia calls for data-efficient
approaches due to the unavailability of sufficient amounts of annotated
multi-writer texts. Therefore, this work presents an end-to-end framework for
automatic detection and recognition of handwritten marginalia, and leverages
data augmentation and transfer learning to overcome training data scarcity. The
detection phase involves investigation of R-CNN and Faster R-CNN networks. The
recognition phase includes an attention-based sequence-to-sequence model, with
ResNet feature extraction, bidirectional LSTM-based sequence modeling, and
attention-based prediction of marginalia. The effectiveness of the proposed
framework has been empirically evaluated on the data from early book
collections found in the Uppsala University Library in Sweden. Source code and
pre-trained models are available on GitHub.
Related papers
- Semantic Meta-Split Learning: A TinyML Scheme for Few-Shot Wireless Image Classification [50.28867343337997]
This work presents a TinyML-based semantic communication framework for few-shot wireless image classification.
We exploit split learning to limit the computations performed by the end users while preserving privacy.
Meta-learning overcomes data availability concerns and speeds up training by utilizing similarly trained tasks.
arXiv Detail & Related papers (2024-09-03T05:56:55Z)
- Classification of Non-native Handwritten Characters Using Convolutional Neural Network [0.0]
English characters written by non-native users are classified using a custom-tailored CNN model.
We train this CNN with a new dataset called the handwritten isolated English character dataset.
The proposed model with five convolutional layers and one hidden layer outperforms state-of-the-art models in terms of character recognition accuracy.
arXiv Detail & Related papers (2024-06-06T21:08:07Z)
- Self-Supervised Representation Learning for Online Handwriting Text Classification [0.8594140167290099]
We propose Part of Stroke Masking (POSM), a novel pretext task for pretraining models to extract informative representations from the online handwriting of individuals in English and Chinese.
To evaluate the quality of the extracted representations, we use both intrinsic and extrinsic evaluation methods.
The pretrained models are fine-tuned to achieve state-of-the-art results in tasks such as writer identification, gender classification, and handedness classification.
arXiv Detail & Related papers (2023-10-10T14:07:49Z)
- Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z)
- SignBERT+: Hand-model-aware Self-supervised Pre-training for Sign Language Understanding [132.78015553111234]
Hand gestures play a crucial role in the expression of sign language.
Current deep-learning-based methods for sign language understanding (SLU) are prone to over-fitting due to insufficient sign data resources.
We propose the first self-supervised pre-trainable SignBERT+ framework with model-aware hand prior incorporated.
arXiv Detail & Related papers (2023-05-08T17:16:38Z)
- ConTextual Mask Auto-Encoder for Dense Passage Retrieval [49.49460769701308]
CoT-MAE is a simple yet effective generative pre-training method for dense passage retrieval.
It learns to compress the sentence semantics into a dense vector through self-supervised and context-supervised masked auto-encoding.
We conduct experiments on large-scale passage retrieval benchmarks and show considerable improvements over strong baselines.
arXiv Detail & Related papers (2022-08-16T11:17:22Z)
- AttentionHTR: Handwritten Text Recognition Based on Attention Encoder-Decoder Networks [0.0]
This work proposes an attention-based sequence-to-sequence model for handwritten word recognition.
It exploits models pre-trained on scene text images as a starting point towards tailoring the handwriting recognition models.
The effectiveness of the proposed end-to-end HTR system has been empirically evaluated on a novel multi-writer dataset.
arXiv Detail & Related papers (2022-01-23T22:48:36Z)
- Continuous Offline Handwriting Recognition using Deep Learning Models [0.0]
Handwritten text recognition is an open problem of great interest in the area of automatic document image analysis.
We propose a new recognition model that integrates two types of deep learning architectures: convolutional neural networks (CNN) and sequence-to-sequence (seq2seq) models.
The proposed model provides results competitive with those obtained with other well-established methodologies.
arXiv Detail & Related papers (2021-12-26T07:31:03Z)
- Lexically Aware Semi-Supervised Learning for OCR Post-Correction [90.54336622024299]
Much of the existing linguistic data in many languages of the world is locked away in non-digitized books and documents.
Previous work has demonstrated the utility of neural post-correction methods on recognition of less-well-resourced languages.
We present a semi-supervised learning method that makes it possible to utilize raw images to improve performance.
arXiv Detail & Related papers (2021-11-04T04:39:02Z)
- Object Detection Based Handwriting Localization [2.6641834518599308]
We present an object detection based approach to localize handwritten regions from documents.
The proposed approach is also expected to facilitate other tasks such as handwriting recognition and signature verification.
arXiv Detail & Related papers (2021-06-28T21:25:20Z)
- Towards Accurate Scene Text Recognition with Semantic Reasoning Networks [52.86058031919856]
We propose a novel end-to-end trainable framework named semantic reasoning network (SRN) for accurate scene text recognition.
A global semantic reasoning module (GSRM) is introduced to capture global semantic context through multi-way parallel transmission.
Results on 7 public benchmarks, including regular text, irregular text and non-Latin long text, verify the effectiveness and robustness of the proposed method.
arXiv Detail & Related papers (2020-03-27T09:19:25Z)