Benchmarking Online Sequence-to-Sequence and Character-based Handwriting
Recognition from IMU-Enhanced Pens
- URL: http://arxiv.org/abs/2202.07036v1
- Date: Mon, 14 Feb 2022 20:55:33 GMT
- Title: Benchmarking Online Sequence-to-Sequence and Character-based Handwriting
Recognition from IMU-Enhanced Pens
- Authors: Felix Ott and David Rügamer and Lucas Heublein and Tim Hamann and
Jens Barth and Bernd Bischl and Christopher Mutschler
- Abstract summary: This paper presents data and benchmark models for real-time sequence-to-sequence learning and single character-based recognition.
Data is recorded by a sensor-enhanced ballpoint pen, yielding sensor data streams from triaxial accelerometers, a gyroscope, a magnetometer and a force sensor at 100Hz.
We propose a variety of datasets including equations and words for both the writer-dependent and writer-independent tasks.
- Score: 2.840092825973023
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Handwriting is one of the most frequently occurring patterns in everyday life
and with it come challenging applications such as handwriting recognition
(HWR), writer identification, and signature verification. In contrast to
offline HWR that only uses spatial information (i.e., images), online HWR
(OnHWR) uses richer spatio-temporal information (i.e., trajectory data or
inertial data). While there exist many offline HWR datasets, there is only
little data available for the development of OnHWR methods as it requires
hardware-integrated pens. This paper presents data and benchmark models for
real-time sequence-to-sequence (seq2seq) learning and single character-based
recognition. Our data is recorded by a sensor-enhanced ballpoint pen, yielding
sensor data streams from triaxial accelerometers, a gyroscope, a magnetometer
and a force sensor at 100Hz. We propose a variety of datasets including
equations and words for both the writer-dependent and writer-independent tasks.
We provide an evaluation benchmark for seq2seq and single character-based HWR
using recurrent and temporal convolutional networks and Transformers combined
with a connectionist temporal classification (CTC) loss and cross entropy
losses. Our methods do not resort to language or lexicon models.
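To illustrate how the per-timestep outputs of a CTC-trained model can be turned into text without a language or lexicon model, the following is a minimal sketch of greedy (best-path) CTC decoding. It is not the authors' code: the function name `ctc_greedy_decode`, the choice of index 0 for the blank symbol, and the toy alphabet are assumptions made for illustration only.

```python
from typing import List, Sequence

BLANK = 0  # assumption: class index 0 is the CTC blank symbol


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]],
    alphabet: str,
    blank: int = BLANK,
) -> str:
    """Best-path CTC decoding for one sequence.

    frame_scores holds one score vector per timestep (e.g. one per 100Hz
    sensor frame after any downsampling); class k >= 1 maps to alphabet[k - 1].
    """
    decoded: List[str] = []
    previous = blank
    for scores in frame_scores:
        # Pick the most likely class for this timestep.
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit a character only when it is not blank and not a repeat
        # of the class chosen at the previous timestep.
        if best != blank and best != previous:
            decoded.append(alphabet[best - 1])
        previous = best
    return "".join(decoded)


def one_hot(index: int, size: int) -> List[float]:
    """Hypothetical helper that builds a toy score vector for the demo."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Toy per-frame class path: blank, a, a, blank, a, b, b
toy_path = [0, 1, 1, 0, 1, 2, 2]
toy_scores = [one_hot(k, 3) for k in toy_path]
toy_text = ctc_greedy_decode(toy_scores, alphabet="ab")
```

In this toy run the repeated `a` frames collapse into one character, while the blank between them keeps the second `a` distinct, so `toy_text` is `"aab"`.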
Related papers
- General Detection-based Text Line Recognition [15.761142324480165]
We introduce a general detection-based approach to text line recognition, be it printed (OCR) or handwritten (HTR).
Our approach builds on a completely different paradigm than state-of-the-art HTR methods, which rely on autoregressive decoding.
We improve state-of-the-art performances for Chinese script recognition on the CASIA v2 dataset, and for cipher recognition on the Borg and Copiale datasets.
arXiv Detail & Related papers (2024-09-25T17:05:55Z)
- FastMAC: Stochastic Spectral Sampling of Correspondence Graph [55.75524096647733]
We present the first study that introduces graph signal processing into the domain of correspondence graph.
We exploit the generalized degree signal on correspondence graph and pursue sampling strategies that preserve high-frequency components.
As an application, we build a complete 3D registration algorithm termed FastMAC, that reaches real-time speed.
arXiv Detail & Related papers (2024-03-13T17:59:56Z)
- Offline Detection of Misspelled Handwritten Words by Convolving Recognition Model Features with Text Labels [0.0]
We introduce the task of comparing a handwriting image to text.
Our model's classification head is trained entirely on synthetic data created using a state-of-the-art generative adversarial network.
Such massive performance gains can lead to significant productivity increases in applications utilizing human-in-the-loop automation.
arXiv Detail & Related papers (2023-09-18T21:13:42Z)
- Context Perception Parallel Decoder for Scene Text Recognition [52.620841341333524]
Scene text recognition methods have struggled to attain high accuracy and fast inference speed.
We present an empirical study of AR decoding in STR, and discover that the AR decoder not only models linguistic context, but also provides guidance on visual context perception.
We construct a series of CPPD models and also plug the proposed modules into existing STR decoders. Experiments on both English and Chinese benchmarks demonstrate that the CPPD models achieve highly competitive accuracy while running approximately 8x faster than their AR-based counterparts.
arXiv Detail & Related papers (2023-07-23T09:04:13Z)
- End-to-End Page-Level Assessment of Handwritten Text Recognition [69.55992406968495]
HTR systems increasingly face the end-to-end page-level transcription of a document.
Standard metrics do not take into account the inconsistencies that might appear.
We propose a two-fold evaluation, where the transcription accuracy and the RO goodness are considered separately.
arXiv Detail & Related papers (2023-01-14T15:43:07Z)
- When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition [57.51793420986745]
We propose an unconventional network for handwritten mathematical expression recognition (HMER) named Counting-Aware Network (CAN).
We design a weakly-supervised counting module that can predict the number of each symbol class without the symbol-level position annotations.
Experiments on the benchmark datasets for HMER validate that both joint optimization and counting results are beneficial for correcting the prediction errors of encoder-decoder models.
arXiv Detail & Related papers (2022-07-23T08:39:32Z)
- Auxiliary Cross-Modal Representation Learning with Triplet Loss Functions for Online Handwriting Recognition [3.071136270246468]
Cross-modal representation learning learns a shared embedding between two or more modalities to improve performance in a given task.
We present a triplet loss with a dynamic margin for single label and sequence-to-sequence classification tasks.
Our experiments show an improved classification accuracy, faster convergence, and better generalizability due to an improved cross-modal representation.
arXiv Detail & Related papers (2022-02-16T07:09:04Z)
- Letter-level Online Writer Identification [86.13203975836556]
We focus on a novel problem, letter-level online writer-id, which requires only a few trajectories of written letters as identification cues.
A main challenge is that a person often writes a letter in different styles from time to time.
We refer to this problem as the variance of online writing styles (Var-O-Styles).
arXiv Detail & Related papers (2021-12-06T07:21:53Z)
- Digitizing Handwriting with a Sensor Pen: A Writer-Independent Recognizer [0.2580765958706854]
This paper presents a writer-independent system that recognizes characters written on plain paper with the use of a sensor-equipped pen.
The pen provides linear acceleration, angular velocity, magnetic field, and force applied by the user, and acts as a digitizer that transforms the analogue signals of the sensors into time data while writing on regular paper.
We present the results of a convolutional neural network model for letter classification and show that this approach is practical and achieves promising results for writer-independent character recognition.
arXiv Detail & Related papers (2021-07-08T09:25:59Z)
- Towards an IMU-based Pen Online Handwriting Recognizer [2.6707647984082357]
We present an online handwriting recognition system for word recognition based on inertial measurement units (IMUs).
This is obtained by means of a sensor-equipped pen that provides acceleration, angular velocity, and magnetic forces streamed via Bluetooth.
Our model combines convolutional and bidirectional LSTM networks, and is trained with the Connectionist Temporal Classification loss.
arXiv Detail & Related papers (2021-05-26T09:47:19Z)
- Rethinking Text Line Recognition Models [57.47147190119394]
We consider two decoder families (Connectionist Temporal Classification and Transformer) and three encoder modules (Bidirectional LSTMs, Self-Attention, and GRCLs).
We compare their accuracy and performance on widely used public datasets of scene and handwritten text.
Unlike the more common Transformer-based models, this architecture can handle inputs of arbitrary length.
arXiv Detail & Related papers (2021-04-15T21:43:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.