Improving Accuracy and Explainability of Online Handwriting Recognition
- URL: http://arxiv.org/abs/2209.09102v1
- Date: Wed, 14 Sep 2022 21:28:14 GMT
- Title: Improving Accuracy and Explainability of Online Handwriting Recognition
- Authors: Hilda Azimi, Steven Chang, Jonathan Gold, Koray Karabina
- Abstract summary: We develop handwriting recognition models on the OnHW-chars dataset and improve the accuracy of previous models.
Our results are verifiable and reproducible via the provided public repository.
- Score: 0.9176056742068814
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Handwriting recognition technology allows recognizing written text
from given data. The recognition task can target letters, symbols, or words,
and the input data can be a digital image or recordings from various sensors. A wide range
of applications from signature verification to electronic document processing
can be realized by implementing efficient and accurate handwriting recognition
algorithms. Over the years, there has been an increasing interest in
experimenting with different types of technology to collect handwriting data,
create datasets, and develop algorithms to recognize characters and symbols.
More recently, the OnHW-chars dataset has been published that contains
multivariate time series data of the English alphabet collected using a
ballpoint pen fitted with sensors. The authors of OnHW-chars also provided some
baseline results through their machine learning (ML) and deep learning (DL)
classifiers.
In this paper, we develop handwriting recognition models on the OnHW-chars
dataset and improve the accuracy of previous models. More specifically, our ML
models provide $11.3\%$-$23.56\%$ improvements over the previous ML models, and
our optimized DL models with ensemble learning provide $3.08\%$-$7.01\%$
improvements over the previous DL models. In addition to our accuracy
improvements across the spectrum, we aim to provide some level of
explainability for our models, offering more insight into the chosen methods
and why they suit the data type in the dataset. Our results are verifiable and
reproducible via the provided public repository.
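The abstract's "optimized DL models with ensemble learning" suggests combining several base classifiers. As an illustrative sketch only (not the authors' code; function names and the toy numbers are hypothetical), a minimal soft-voting ensemble averages each model's per-class probabilities and predicts the argmax:

```python
# Minimal soft-voting ensemble sketch (illustrative; not the paper's implementation).
# Each base classifier returns a probability distribution over character classes;
# the ensemble averages the distributions and predicts the highest-scoring class.

def soft_vote(prob_lists):
    """Average per-model class probabilities and return the winning class index.

    prob_lists: one probability vector per model, all of equal length
    (the number of classes).
    """
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)

# Example: three hypothetical models scoring three character classes.
models_out = [
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
    [0.4, 0.3, 0.3],
]
winner = soft_vote(models_out)  # class index 1 has the highest average probability
```

Soft voting tends to smooth out the idiosyncratic errors of individual models, which is one common reason ensembles improve accuracy on noisy sensor time series.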
Related papers
- On Pre-training of Multimodal Language Models Customized for Chart Understanding [83.99377088129282]
This paper explores the training processes necessary to improve MLLMs' comprehension of charts.
We introduce CHOPINLLM, an MLLM tailored for in-depth chart comprehension.
arXiv Detail & Related papers (2024-07-19T17:58:36Z)
- T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from Text [59.57676466961787]
We propose a novel dynamic vector quantization (DVA-VAE) model that can adjust the encoding length based on the information density in sign language.
Experiments conducted on the PHOENIX14T dataset demonstrate the effectiveness of our proposed method.
We propose a new large German sign language dataset, PHOENIX-News, which contains 486 hours of sign language videos, audio, and transcription texts.
arXiv Detail & Related papers (2024-06-11T10:06:53Z)
- Improving Classification Performance With Human Feedback: Label a few, we label the rest [2.7386128680964408]
This paper focuses on understanding how a continuous feedback loop can refine models, thereby enhancing their accuracy, recall, and precision.
We benchmark this approach on the Financial Phrasebank, Banking, Craigslist, Trec, and Amazon Reviews datasets to show that with just a few labeled examples, we are able to surpass the accuracy of zero-shot large language models.
arXiv Detail & Related papers (2024-01-17T19:13:05Z)
- Improving the Generation Quality of Watermarked Large Language Models via Word Importance Scoring [81.62249424226084]
Token-level watermarking inserts watermarks in the generated texts by altering the token probability distributions.
This watermarking algorithm alters the logits during generation, which can lead to a downgraded text quality.
We propose to improve the quality of texts generated by a watermarked language model by Watermarking with Importance Scoring (WIS).
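To make the "altering the token probability distributions" step concrete, here is a sketch of the generic green-list logit-bias scheme that token-level watermarking commonly uses (this illustrates the baseline the entry alludes to, not the paper's WIS method; all names are hypothetical):

```python
# Generic token-level watermarking sketch (green-list logit bias).
# A pseudo-random subset of the vocabulary ("green" tokens), seeded by the
# previous token, gets a small bias added to its logits before sampling,
# which statistically marks the generated text.
import hashlib

def green_list(prev_token_id, vocab_size, fraction=0.5):
    """Deterministically pick ~fraction of the vocabulary, seeded by prev token."""
    greens = set()
    for tok in range(vocab_size):
        digest = hashlib.sha256(f"{prev_token_id}:{tok}".encode()).digest()
        if digest[0] / 255.0 < fraction:
            greens.add(tok)
    return greens

def watermark_logits(logits, prev_token_id, delta=2.0):
    """Return logits with a bias delta added to green tokens."""
    greens = green_list(prev_token_id, len(logits))
    return [l + delta if t in greens else l for t, l in enumerate(logits)]

biased = watermark_logits([0.0, 0.0, 0.0, 0.0], prev_token_id=7)
```

Because the bias shifts the sampling distribution, text quality can degrade, which is the problem the importance-scoring approach above targets.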
arXiv Detail & Related papers (2023-11-16T08:36:00Z)
- Self-Supervised Representation Learning for Online Handwriting Text Classification [0.8594140167290099]
We propose the novel Part of Stroke Masking (POSM) as a pretext task for pretraining models to extract informative representations from the online handwriting of individuals in English and Chinese languages.
To evaluate the quality of the extracted representations, we use both intrinsic and extrinsic evaluation methods.
The pretrained models are fine-tuned to achieve state-of-the-art results in tasks such as writer identification, gender classification, and handedness classification.
arXiv Detail & Related papers (2023-10-10T14:07:49Z)
- Offline Detection of Misspelled Handwritten Words by Convolving Recognition Model Features with Text Labels [0.0]
We introduce the task of comparing a handwriting image to text.
Our model's classification head is trained entirely on synthetic data created using a state-of-the-art generative adversarial network.
Such massive performance gains can lead to significant productivity increases in applications utilizing human-in-the-loop automation.
arXiv Detail & Related papers (2023-09-18T21:13:42Z)
- Sampling and Ranking for Digital Ink Generation on a tight computational budget [69.15275423815461]
We study ways to maximize the quality of the output of a trained digital ink generative model.
We use and compare the effect of multiple sampling and ranking techniques, in the first ablation study of its kind in the digital ink domain.
arXiv Detail & Related papers (2023-06-02T09:55:15Z)
- Continuous Offline Handwriting Recognition using Deep Learning Models [0.0]
Handwritten text recognition is an open problem of great interest in the area of automatic document image analysis.
We have proposed a new recognition model based on integrating two types of deep learning architectures: convolutional neural networks (CNN) and sequence-to-sequence (seq2seq) models.
The new proposed model provides competitive results with those obtained with other well-established methodologies.
arXiv Detail & Related papers (2021-12-26T07:31:03Z)
- Lexically Aware Semi-Supervised Learning for OCR Post-Correction [90.54336622024299]
Much of the existing linguistic data in many languages of the world is locked away in non-digitized books and documents.
Previous work has demonstrated the utility of neural post-correction methods on recognition of less-well-resourced languages.
We present a semi-supervised learning method that makes it possible to utilize raw images to improve performance.
arXiv Detail & Related papers (2021-11-04T04:39:02Z)
- VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer [76.3906723777229]
We present VidLanKD, a video-language knowledge distillation method for improving language understanding.
We train a multi-modal teacher model on a video-text dataset, and then transfer its knowledge to a student language model with a text dataset.
In our experiments, VidLanKD achieves consistent improvements over text-only language models and vokenization models.
arXiv Detail & Related papers (2021-07-06T15:41:32Z)
- Handwritten Digit Recognition using Machine and Deep Learning Algorithms [0.0]
We have performed handwritten digit recognition with the help of MNIST datasets using Support Vector Machines (SVM), Multi-Layer Perceptron (MLP) and Convolution Neural Network (CNN) models.
Our main objective is to compare the accuracy of the models stated above along with their execution time to get the best possible model for digit recognition.
arXiv Detail & Related papers (2021-06-23T18:23:01Z)
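The accuracy-versus-execution-time comparison described in the last entry can be sketched with a small evaluation harness. The stand-in classifiers below are toys (a real comparison would fit SVM/MLP/CNN models on MNIST); only the timing-and-scoring pattern is the point:

```python
# Sketch of an accuracy-vs-runtime comparison harness (toy classifiers;
# a real study would substitute trained SVM/MLP/CNN models).
import time

def evaluate(name, predict, samples, labels):
    """Time a model's predictions over a sample set and compute its accuracy."""
    start = time.perf_counter()
    preds = [predict(x) for x in samples]
    elapsed = time.perf_counter() - start
    acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
    return {"model": name, "accuracy": acc, "seconds": elapsed}

# Toy stand-ins: "classify" an integer input by a trivial rule.
samples = [0, 1, 2, 3, 4]
labels = [0, 1, 2, 3, 0]
report = [
    evaluate("always-zero", lambda x: 0, samples, labels),  # 2/5 correct
    evaluate("identity", lambda x: x, samples, labels),     # 4/5 correct
]
```

Reporting accuracy and wall-clock time side by side, as here, is what lets such a study pick the "best possible model" under a latency constraint rather than on accuracy alone.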
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.