Handwritten Stenography Recognition and the LION Dataset
- URL: http://arxiv.org/abs/2308.07799v1
- Date: Tue, 15 Aug 2023 14:25:53 GMT
- Title: Handwritten Stenography Recognition and the LION Dataset
- Authors: Raphaela Heil, Malin Nauwerck
- Abstract summary: Stenographic domain knowledge is integrated by applying four different encoding methods.
Test error rates are reduced significantly by combining stenography-specific target sequence encodings with pre-training and fine-tuning.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Purpose: In this paper, we establish a baseline for handwritten stenography
recognition, using the novel LION dataset, and investigate the impact of
incorporating selected aspects of stenographic theory into the recognition process.
We make the LION dataset publicly available with the aim of encouraging future
research in handwritten stenography recognition.
Methods: A state-of-the-art text recognition model is trained to establish a
baseline. Stenographic domain knowledge is integrated by applying four
different encoding methods that transform the target sequence into
representations that approximate selected aspects of the writing system.
Results are further improved by integrating a pre-training scheme, based on
synthetic data.
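The four target-sequence encodings are not spelled out in this summary. Purely as an illustration of the general idea (re-encoding the ground-truth transcription into an alternative symbol alphabet before training and mapping predictions back afterwards), a minimal, hypothetical Python sketch could look as follows; the mapping table is invented for illustration and is not one of the paper's encodings.

# Hypothetical sketch: re-encode target transcriptions into an alternative
# symbol alphabet before training, and invert the mapping on model output.
# The mapping below is invented for illustration; it is NOT one of the four
# encodings used in the paper, and it assumes the replacement symbols do not
# otherwise occur in the transcriptions.

DIGRAPH_MAP = {"ch": "C", "sk": "S", "ng": "N"}

def encode_target(text: str) -> str:
    """Replace selected digraphs with single target symbols."""
    for digraph, symbol in DIGRAPH_MAP.items():
        text = text.replace(digraph, symbol)
    return text

def decode_prediction(pred: str) -> str:
    """Map the model's output symbols back to plain text."""
    for digraph, symbol in DIGRAPH_MAP.items():
        pred = pred.replace(symbol, digraph)
    return pred

if __name__ == "__main__":
    target = "skogen sjunger"
    encoded = encode_target(target)            # "Sogen sjuNer"
    print(encoded, "->", decode_prediction(encoded))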
Results: The baseline model achieves an average test character error rate
(CER) of 29.81% and a word error rate (WER) of 55.14%. Test error rates are
reduced significantly by combining stenography-specific target sequence
encodings with pre-training and fine-tuning, yielding CERs in the range of
24.5% - 26% and WERs of 44.8% - 48.2%.
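For reference, CER and WER are standard Levenshtein-distance-based metrics: the character-level (respectively word-level) edit distance between prediction and ground truth, divided by the reference length. A minimal sketch of how they are typically computed (not the paper's evaluation code) is given below.

# Minimal sketch of character and word error rates (CER/WER), following the
# standard definitions; this is not the paper's actual evaluation code.

def levenshtein(ref, hyp) -> int:
    """Edit distance (substitutions, insertions, deletions) between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1]

def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance / reference length."""
    return levenshtein(reference, hypothesis) / max(len(reference), 1)

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    return levenshtein(ref_words, hyp_words) / max(len(ref_words), 1)

if __name__ == "__main__":
    print(cer("stenografi", "stenogrbfi"))           # 0.1
    print(wer("det var en gång", "det var engång"))  # 0.5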
Conclusion: The obtained results demonstrate the challenging nature of
stenography recognition. Integrating stenography-specific knowledge, in
conjunction with pre-training and fine-tuning on synthetic data, yields
considerable improvements. Together with our precursor study on the subject,
this is the first work to apply modern handwritten text recognition to
stenography. The dataset and our code are publicly available via Zenodo.
Related papers
- Affinity-Graph-Guided Contrastive Learning for Pretext-Free Medical Image Segmentation with Minimal Annotation [55.325956390997]
This paper proposes an affinity-graph-guided semi-supervised contrastive learning framework (Semi-AGCL) for medical image segmentation.
The framework first designs an average-patch-entropy-driven inter-patch sampling method, which can provide a robust initial feature space.
With merely 10% of the complete annotation set, our model approaches the accuracy of the fully annotated baseline, manifesting a marginal deviation of only 2.52%.
arXiv Detail & Related papers (2024-10-14T10:44:47Z) - JSTR: Judgment Improves Scene Text Recognition [0.0]
We present a method for enhancing the accuracy of scene text recognition tasks by judging whether the image and text match each other.
This method boosts text recognition accuracy by providing explicit feedback on the data that the model is likely to misrecognize.
arXiv Detail & Related papers (2024-04-09T02:55:12Z) - Detecting and recognizing characters in Greek papyri with YOLOv8, DeiT
and SimCLR [9.7902367664742]
This paper discusses our submission to the ICDAR 2023 Competition on Detection and Recognition of Greek Letters on Papyri.
We used an ensemble of YOLOv8 models to detect and classify individual characters and employed two different approaches for refining the character predictions.
Our submission won the recognition challenge with a mean average precision (mAP) of 42.2%, and was runner-up in the detection challenge with an mAP of 51.4%.
arXiv Detail & Related papers (2024-01-23T06:08:00Z) - Large Language Models Meet Knowledge Graphs to Answer Factoid Questions [57.47634017738877]
We propose a method for exploring pre-trained Text-to-Text Language Models enriched with additional information from Knowledge Graphs.
We obtain easily interpretable information from Transformer-based models by linearizing the extracted subgraphs.
Final re-ranking of the answer candidates with the extracted information boosts Hits@1 scores of the pre-trained text-to-text language models by 4-6%.
arXiv Detail & Related papers (2023-10-03T15:57:00Z) - Uncovering the Handwritten Text in the Margins: End-to-end Handwritten
Text Detection and Recognition [0.840835093659811]
This work presents an end-to-end framework for automatic detection and recognition of handwritten marginalia.
It uses data augmentation and transfer learning to overcome training data scarcity.
The effectiveness of the proposed framework has been empirically evaluated on the data from early book collections found in the Uppsala University Library in Sweden.
arXiv Detail & Related papers (2023-03-10T14:00:53Z) - A Study of Augmentation Methods for Handwritten Stenography Recognition [0.0]
We study 22 classical augmentation techniques, most of which are commonly used for HTR of other scripts.
We identify a group of augmentations, for example constrained ranges of random rotation, shifts and scaling, that are beneficial to the use case of stenography recognition.
arXiv Detail & Related papers (2023-03-05T20:06:19Z) - Reading and Writing: Discriminative and Generative Modeling for
Self-Supervised Text Recognition [101.60244147302197]
We introduce contrastive learning and masked image modeling to learn discrimination and generation of text images.
Our method outperforms previous self-supervised text recognition methods by 10.2%-20.2% on irregular scene text recognition datasets.
Our proposed text recognizer exceeds previous state-of-the-art text recognition methods by an average of 5.3% on 11 benchmarks, with a similar model size.
arXiv Detail & Related papers (2022-07-01T03:50:26Z) - Lexically Aware Semi-Supervised Learning for OCR Post-Correction [90.54336622024299]
Much of the existing linguistic data in many languages of the world is locked away in non-digitized books and documents.
Previous work has demonstrated the utility of neural post-correction methods on recognition of less-well-resourced languages.
We present a semi-supervised learning method that makes it possible to utilize raw images to improve performance.
arXiv Detail & Related papers (2021-11-04T04:39:02Z) - One-shot Compositional Data Generation for Low Resource Handwritten Text
Recognition [10.473427493876422]
Low resource Handwritten Text Recognition is a hard problem due to the scarce annotated data and the very limited linguistic information.
In this paper we address this problem through a data generation technique based on Bayesian Program Learning.
Contrary to traditional generation approaches, which require a huge amount of annotated images, our method is able to generate human-like handwriting using only one sample of each symbol from the desired alphabet.
arXiv Detail & Related papers (2021-05-11T18:53:01Z) - Near-imperceptible Neural Linguistic Steganography via Self-Adjusting
Arithmetic Coding [88.31226340759892]
We present a new linguistic steganography method which encodes secret messages using self-adjusting arithmetic coding based on a neural language model.
Human evaluations show that 51% of generated cover texts can indeed fool eavesdroppers.
arXiv Detail & Related papers (2020-10-01T20:40:23Z) - Investigating Pretrained Language Models for Graph-to-Text Generation [55.55151069694146]
Graph-to-text generation aims to generate fluent texts from graph-based data.
We present a study across three graph domains: meaning representations, Wikipedia knowledge graphs (KGs) and scientific KGs.
We show that the PLMs BART and T5 achieve new state-of-the-art results and that task-adaptive pretraining strategies improve their performance even further.
arXiv Detail & Related papers (2020-07-16T16:05:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.