DARE: A large-scale handwritten date recognition system
- URL: http://arxiv.org/abs/2210.00503v1
- Date: Sun, 2 Oct 2022 12:47:36 GMT
- Title: DARE: A large-scale handwritten date recognition system
- Authors: Christian M. Dahl, Torben S. D. Johansen, Emil N. Sørensen,
Christian E. Westermann, Simon F. Wittrock
- Abstract summary: We introduce a database containing almost 10 million tokens, originating from more than 2.2 million handwritten dates.
We show that training on handwritten text with high variability in writing styles results in robust models for general handwritten text recognition.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Handwritten text recognition for historical documents is an important task
but it remains difficult due to a lack of sufficient training data in
combination with a large variability of writing styles and degradation of
historical documents. While recurrent neural network architectures are commonly
used for handwritten text recognition, they are often computationally expensive
to train and the benefit of recurrence drastically differs by task. For these
reasons, it is important to consider non-recurrent architectures. In the
context of handwritten date recognition, we propose an architecture based on
the EfficientNetV2 class of models that is fast to train, robust to parameter
choices, and accurately transcribes handwritten dates from a number of sources.
For training, we introduce a database containing almost 10 million tokens,
originating from more than 2.2 million handwritten dates which are segmented
from different historical documents. As dates are some of the most common
information on historical documents, and with historical archives containing
millions of such documents, the efficient and automatic transcription of dates
has the potential to lead to significant cost-savings over manual
transcription. We show that training on handwritten text with high variability
in writing styles results in robust models for general handwritten text
recognition and that transfer learning from the DARE system increases
transcription accuracy substantially, allowing one to obtain high accuracy even
when using a relatively small training sample.
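The abstract describes dates as sequences of tokens (day, month, year) transcribed from segmented images. As a minimal sketch of the decoding step such a system needs, the following combines hypothetical per-token classifier outputs into the most probable calendar-valid date. This is not code from the paper: the probability-dictionary input format and the `decode_date` helper are illustrative stand-ins for the outputs of the EfficientNetV2 classifiers.

```python
from datetime import date
from itertools import product

def decode_date(day_probs, month_probs, year_probs, top_k=3):
    """Pick the most probable calendar-valid date from per-token scores.

    Each argument is a hypothetical dict mapping a candidate value
    (e.g. day 31) to the classifier's probability for that value.
    Combinations that are not valid dates (e.g. 31 February) are skipped.
    """
    def top(probs):
        return sorted(probs.items(), key=lambda kv: -kv[1])[:top_k]

    best, best_score = None, 0.0
    for (d, pd), (m, pm), (y, py) in product(
        top(day_probs), top(month_probs), top(year_probs)
    ):
        try:
            candidate = date(y, m, d)  # raises ValueError for impossible dates
        except ValueError:
            continue
        score = pd * pm * py  # tokens treated as independent
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

# Toy example: the raw argmax "31 February 1901" is impossible,
# so the decoder falls back to the best valid combination.
decoded, score = decode_date(
    day_probs={31: 0.6, 1: 0.4},
    month_probs={2: 0.9, 3: 0.1},
    year_probs={1901: 1.0},
)
```

Here the decoder rejects 31 February and returns 1 February 1901 (score 0.4 × 0.9 × 1.0). A production system would also handle unreadable tokens and empty fields, but those details are beyond this sketch.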
Related papers
- Handwriting Recognition in Historical Documents with Multimodal LLM [0.0]
Multimodal Large Language Models have demonstrated effectiveness in performing OCR and computer vision tasks with few-shot prompting.
I evaluate the accuracy of handwritten document transcriptions generated by Gemini against current state-of-the-art Transformer-based methods.
arXiv Detail & Related papers (2024-10-31T15:32:14Z)
- Contrastive Entity Coreference and Disambiguation for Historical Texts [2.446672595462589]
Existing entity disambiguation methods often fall short in accuracy for historical documents, which are replete with individuals not remembered in contemporary knowledge bases.
This study makes three key contributions to improve cross-document coreference resolution and disambiguation in historical texts.
arXiv Detail & Related papers (2024-06-21T18:22:14Z)
- How to Choose Pretrained Handwriting Recognition Models for Single Writer Fine-Tuning [23.274139396706264]
Recent advancements in Deep Learning-based Handwritten Text Recognition (HTR) have led to models with remarkable performance on modern and historical manuscripts.
However, these models struggle to obtain the same performance when applied to manuscripts with peculiar characteristics, such as language, paper support, ink, and author handwriting.
In this paper, we take into account large, real benchmark datasets and synthetic ones obtained with a styled Handwritten Text Generation model.
We give a quantitative indication of the most relevant characteristics of such data for obtaining an HTR model able to effectively transcribe manuscripts in small collections with as few as five real fine-tuning lines.
arXiv Detail & Related papers (2023-05-04T07:00:28Z)
- An end-to-end, interactive Deep Learning based Annotation system for cursive and print English handwritten text [0.0]
We present an innovative, complete end-to-end pipeline, that annotates offline handwritten manuscripts written in both print and cursive English.
This novel method combines a detection system built upon a state-of-the-art text detection model with a custom-made Deep Learning model for the recognition system.
arXiv Detail & Related papers (2023-04-18T00:24:07Z)
- Uncovering the Handwritten Text in the Margins: End-to-end Handwritten Text Detection and Recognition [0.840835093659811]
This work presents an end-to-end framework for automatic detection and recognition of handwritten marginalia.
It uses data augmentation and transfer learning to overcome training data scarcity.
The effectiveness of the proposed framework has been empirically evaluated on the data from early book collections found in the Uppsala University Library in Sweden.
arXiv Detail & Related papers (2023-03-10T14:00:53Z)
- PART: Pre-trained Authorship Representation Transformer [64.78260098263489]
Authors writing documents imprint identifying information within their texts: vocabulary, register, punctuation, misspellings, or even emoji usage.
Previous works use hand-crafted features or classification tasks to train their authorship models, leading to poor performance on out-of-domain authors.
We propose a contrastively trained model fit to learn authorship embeddings instead of semantics.
arXiv Detail & Related papers (2022-09-30T11:08:39Z)
- Boosting Modern and Historical Handwritten Text Recognition with Deformable Convolutions [52.250269529057014]
Handwritten Text Recognition (HTR) in free-layout pages is a challenging image understanding task.
We propose to adopt deformable convolutions, which can deform depending on the input at hand and better adapt to the geometric variations of the text.
arXiv Detail & Related papers (2022-08-17T06:55:54Z)
- Digital Editions as Distant Supervision for Layout Analysis of Printed Books [76.29918490722902]
We describe methods for exploiting the semantic markup of digital editions as distant supervision for training and evaluating layout analysis models.
In experiments with several model architectures on the half-million pages of the Deutsches Textarchiv (DTA), we find a high correlation of these region-level evaluation methods with pixel-level and word-level metrics.
We discuss the possibilities for improving accuracy with self-training and the ability of models trained on the DTA to generalize to other historical printed books.
arXiv Detail & Related papers (2021-12-23T16:51:53Z)
- Lexically Aware Semi-Supervised Learning for OCR Post-Correction [90.54336622024299]
Much of the existing linguistic data in many languages of the world is locked away in non-digitized books and documents.
Previous work has demonstrated the utility of neural post-correction methods on recognition of less-well-resourced languages.
We present a semi-supervised learning method that makes it possible to utilize raw images to improve performance.
arXiv Detail & Related papers (2021-11-04T04:39:02Z)
- SmartPatch: Improving Handwritten Word Imitation with Patch Discriminators [67.54204685189255]
We propose SmartPatch, a new technique that increases the performance of current state-of-the-art methods.
We combine the well-known patch loss with information gathered from a handwritten text recognition system trained in parallel.
This leads to a more enhanced local discriminator and results in more realistic and higher-quality generated handwritten words.
arXiv Detail & Related papers (2021-05-21T18:34:21Z)
- Conditioned Text Generation with Transfer for Closed-Domain Dialogue Systems [65.48663492703557]
We show how to optimally train and control the generation of intent-specific sentences using a conditional variational autoencoder.
We introduce a new protocol called query transfer that makes it possible to leverage a large unlabelled dataset.
arXiv Detail & Related papers (2020-11-03T14:06:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.