Realistic Handwritten Multi-Digit Writer (MDW) Number Recognition Challenges
- URL: http://arxiv.org/abs/2512.00676v1
- Date: Sun, 30 Nov 2025 00:13:34 GMT
- Title: Realistic Handwritten Multi-Digit Writer (MDW) Number Recognition Challenges
- Authors: Kiri L. Wagstaff
- Abstract summary: Isolated digit classification has served as a motivating problem for decades of machine learning research. In real settings, numbers often occur as multiple digits, all written by the same person. In this work, we leverage knowledge about the writers of NIST digit images to create more realistic benchmark multi-digit writer (MDW) data sets.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Isolated digit classification has served as a motivating problem for decades of machine learning research. In real settings, numbers often occur as multiple digits, all written by the same person. Examples include ZIP Codes, handwritten check amounts, and appointment times. In this work, we leverage knowledge about the writers of NIST digit images to create more realistic benchmark multi-digit writer (MDW) data sets. As expected, we find that classifiers may perform well on isolated digits yet do poorly on multi-digit number recognition. If we want to solve real number recognition problems, additional advances are needed. The MDW benchmarks come with task-specific performance metrics that go beyond typical error calculations to more closely align with real-world impact. They also create opportunities to develop methods that can leverage task-specific knowledge to improve performance well beyond that of individual digit classification methods.
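The abstract's observation that strong isolated-digit classifiers can still fail on whole numbers follows from simple probability: if each digit is classified independently with per-digit accuracy p, a k-digit number is fully correct only with probability p^k. A minimal sketch with illustrative numbers (not figures from the paper):

```python
def number_accuracy(per_digit_acc: float, num_digits: int) -> float:
    """Expected whole-number accuracy if each digit is classified
    independently with the same per-digit accuracy."""
    return per_digit_acc ** num_digits

# A 99%-accurate digit classifier reading 5-digit ZIP Codes:
zip_acc = number_accuracy(0.99, 5)
print(f"{zip_acc:.4f}")  # roughly 0.9510, i.e. ~5% of ZIP Codes misread
```

This is why task-specific, number-level metrics can diverge sharply from per-digit error rates.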
Related papers
- Handwritten Digit Recognition: An Ensemble-Based Approach for Superior Performance [9.174021241188143]
This paper presents an ensemble-based approach that combines Convolutional Neural Networks (CNNs) with traditional machine learning techniques to improve recognition accuracy and robustness. We evaluate our method on the MNIST dataset, comprising 70,000 handwritten digit images. Our hybrid model, which uses CNNs for feature extraction and Support Vector Machines (SVMs) for classification, achieves an accuracy of 99.30%.
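The CNN-plus-SVM hybrid is specific to that paper, but the general ensemble idea of combining several classifiers' outputs can be sketched without any framework. A hypothetical majority-vote combiner (the classifiers here are placeholders, not the paper's models):

```python
from collections import Counter

def majority_vote(predictions: list[list[int]]) -> list[int]:
    """Combine per-sample predictions from several classifiers.
    predictions[i][j] is classifier i's label for sample j."""
    n_samples = len(predictions[0])
    combined = []
    for j in range(n_samples):
        votes = Counter(clf[j] for clf in predictions)
        combined.append(votes.most_common(1)[0][0])
    return combined

# Three hypothetical digit classifiers disagree on the middle sample:
print(majority_vote([[7, 2, 1], [7, 3, 1], [7, 3, 8]]))  # [7, 3, 1]
```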
arXiv Detail & Related papers (2025-03-08T07:09:49Z) - FoNE: Precise Single-Token Number Embeddings via Fourier Features [51.17846016593835]
We propose a novel method that maps numbers into the embedding space with their Fourier features. FoNE encodes each number as a single token with only two embedding dimensions per digit, effectively capturing numerical values without fragmentation. On 6-digit decimal addition, FoNE requires 64× less data to achieve 99% accuracy than subword and digit-wise embeddings. FoNE is the only method that yields 100% accuracy on over 100,000 test examples for addition, subtraction, and multiplication.
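The exact FoNE construction is in the paper; the core idea, representing each digit place with one cosine/sine pair whose period matches that place value, can be sketched as follows (a simplified reading, not the authors' code):

```python
import math

def fourier_embed(x: int, n_places: int) -> list[float]:
    """Two features per digit place: a cos/sin pair with period 10**(i+1)."""
    feats = []
    for i in range(n_places):
        period = 10 ** (i + 1)
        angle = 2 * math.pi * x / period
        feats.extend([math.cos(angle), math.sin(angle)])
    return feats

def decode_digit(feats: list[float], place: int) -> int:
    """Recover the digit at 10**place from its cos/sin pair."""
    c, s = feats[2 * place], feats[2 * place + 1]
    frac = (math.atan2(s, c) % (2 * math.pi)) / (2 * math.pi)
    period = 10 ** (place + 1)
    return round(frac * period) % period // (10 ** place)

emb = fourier_embed(347, 3)          # 6 numbers encode 3 digit places
print([decode_digit(emb, p) for p in range(3)])  # [7, 4, 3]
```

Each pair is periodic in exactly one place value, which is why two dimensions per digit suffice to recover the number.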
arXiv Detail & Related papers (2025-02-13T19:54:59Z) - Transformers Can Do Arithmetic with the Right Embeddings [75.66545271398704]
We show how to improve the performance of transformers on arithmetic tasks. We find that, by training on only 20-digit numbers with a single GPU for one day, we can reach state-of-the-art performance. These gains in numeracy also unlock improvements on other multi-step reasoning tasks, including sorting and multiplication.
arXiv Detail & Related papers (2024-05-27T17:49:18Z) - Deep Learning-Driven Approach for Handwritten Chinese Character Classification [0.0]
Handwritten character recognition is a challenging problem for machine learning researchers.
With numerous unique character classes present, some data, such as Logographic Scripts or Sino-Korean character sequences, bring new complications to the HCR problem.
This paper proposes a highly scalable approach for detailed character image classification by introducing the model architecture, data preprocessing steps, and testing design instructions.
arXiv Detail & Related papers (2024-01-30T15:29:32Z) - Identifying and Analyzing Performance-Critical Tokens in Large Language Models [52.404072802235234]
We study how large language models learn to perform tasks from demonstrations, deepening our understanding of the roles different types of tokens play in large language models.
arXiv Detail & Related papers (2024-01-20T20:55:21Z) - Positional Description Matters for Transformers Arithmetic [58.4739272381373]
Transformers often falter on arithmetic tasks despite their vast capabilities.
We propose several ways to fix the issue, either by modifying the positional encoding directly, or by modifying the representation of the arithmetic task to leverage standard positional encoding differently.
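One concrete example of "modifying the representation of the arithmetic task" is emitting answers least-significant-digit first, so that the model produces each digit before the carries that depend on it. A hypothetical formatter illustrating the idea (not necessarily the paper's exact scheme):

```python
def format_addition(a: int, b: int) -> str:
    """Write an addition example with the answer's digits reversed,
    so digit generation order matches carry propagation order."""
    answer = str(a + b)[::-1]
    return f"{a}+{b}={answer}"

print(format_addition(57, 68))  # 57+68=521  (125 written in reverse)
```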
arXiv Detail & Related papers (2023-11-22T00:31:01Z) - Sampling and Ranking for Digital Ink Generation on a tight computational budget [69.15275423815461]
We study ways to maximize the quality of the output of a trained digital ink generative model.
We use and compare the effect of multiple sampling and ranking techniques, in the first ablation study of its kind in the digital ink domain.
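Sampling-and-ranking in general means drawing several candidates from the generator and keeping the one a scorer prefers. A generic sketch with placeholder sampler and scorer functions (the paper's actual models and ranking techniques are not shown here):

```python
import random

def sample_and_rank(sampler, scorer, n_candidates: int, seed: int = 0):
    """Draw n_candidates from `sampler` and return the best under `scorer`."""
    rng = random.Random(seed)
    candidates = [sampler(rng) for _ in range(n_candidates)]
    return max(candidates, key=scorer)

# Toy stand-ins: a "sample" is a number; quality is closeness to 0.5.
best = sample_and_rank(lambda rng: rng.random(),
                       lambda x: -abs(x - 0.5),
                       n_candidates=16)
assert 0.0 <= best <= 1.0
```

More candidates cost more compute, which is exactly the quality/budget trade-off the paper studies.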
arXiv Detail & Related papers (2023-06-02T09:55:15Z) - FERMAT: An Alternative to Accuracy for Numerical Reasoning [11.893004722079557]
Numerical reasoning is measured using a single score on existing datasets.
We introduce a multi-view evaluation set for numerical reasoning in English, called FERMAT.
FERMAT evaluates models on various key numerical reasoning aspects such as number understanding, mathematical operations, and training dependency.
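FERMAT's multi-view idea, reporting accuracy per reasoning aspect rather than one aggregate score, can be sketched generically (the aspect names below are illustrative, not FERMAT's actual categories):

```python
from collections import defaultdict

def per_aspect_accuracy(examples):
    """examples: iterable of (aspect, is_correct) pairs.
    Returns {aspect: accuracy} instead of a single overall score."""
    hits, totals = defaultdict(int), defaultdict(int)
    for aspect, correct in examples:
        totals[aspect] += 1
        hits[aspect] += int(correct)
    return {a: hits[a] / totals[a] for a in totals}

results = [("number_understanding", True), ("number_understanding", True),
           ("operations", True), ("operations", False)]
print(per_aspect_accuracy(results))
# {'number_understanding': 1.0, 'operations': 0.5}
```

A single aggregate would hide the gap between the two aspects, which is the failure mode a multi-view set is designed to expose.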
arXiv Detail & Related papers (2023-05-27T15:00:45Z) - Number Entity Recognition [65.80137628972312]
Numbers are essential components of text, like any other word tokens, from which natural language processing (NLP) models are built and deployed.
In this work, we attempt to tap this potential of state-of-the-art NLP models and transfer their ability to boost performance in related tasks.
Our proposed classification of numbers into entities helps NLP models perform well on several tasks, including a handcrafted Fill-In-The-Blank (FITB) task and on question answering using joint embeddings.
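Treating numbers as recognizable entities starts with finding and typing them in text. A minimal regex-based tagger with hypothetical entity types (the paper's classification scheme may differ):

```python
import re

NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")

def tag_numbers(text: str):
    """Return (value, type) pairs for each number found in the text."""
    tags = []
    for m in NUMBER_RE.finditer(text):
        kind = "float" if "." in m.group() else "integer"
        tags.append((m.group(), kind))
    return tags

print(tag_numbers("Train for 3 epochs at a 0.001 learning rate."))
# [('3', 'integer'), ('0.001', 'float')]
```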
arXiv Detail & Related papers (2022-05-07T05:22:43Z) - NumGPT: Improving Numeracy Ability of Generative Pre-trained Models [59.931394234642816]
We propose NumGPT, a generative pre-trained model that explicitly models the numerical properties of numbers in texts.
Specifically, it leverages a prototype-based numeral embedding to encode the mantissa of the number and an individual embedding to encode the exponent of the number.
A numeral-aware loss function is designed to integrate numerals into the pre-training objective of NumGPT.
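The mantissa/exponent split that NumGPT embeds can be illustrated with a plain base-10 decomposition (a sketch of the idea, not NumGPT's embedding code):

```python
import math

def decimal_decompose(x: float) -> tuple[float, int]:
    """Split a nonzero number into mantissa m (1 <= |m| < 10) and
    integer exponent e such that x = m * 10**e."""
    e = math.floor(math.log10(abs(x)))
    return x / 10 ** e, e

m, e = decimal_decompose(6371.0)
print(m, e)  # 6.371 3
```

Embedding m and e separately keeps numbers of very different magnitudes on comparable footing, rather than leaving magnitude implicit in token length.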
arXiv Detail & Related papers (2021-09-07T15:06:12Z) - Multi-script Handwritten Digit Recognition Using Multi-task Learning [2.8698937226234795]
Multi-script digit recognition is not very common, which encourages the development of robust and multipurpose systems.
In this study, multi-script handwritten digit recognition using multi-task learning is investigated.
Handwritten digits from three scripts, Latin, Arabic, and Kannada, are studied to show that multi-task models with reformulation of the individual tasks yield promising results.
arXiv Detail & Related papers (2021-06-15T16:30:37Z) - Digit Recognition Using Convolution Neural Network [0.0]
This paper aims to extract correct features so as to achieve better accuracy for digit recognition.
Applications of digit recognition, such as password entry and bank check processing, rely on recognizing a valid user's identification.
The main objective of this work is to obtain the highest accuracy, 99.15%, by using a convolutional neural network (CNN).
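The feature-extraction step of a CNN reduces to repeated 2-D convolutions. A dependency-free sketch of one valid-mode convolution pass (illustrative only, not the paper's network):

```python
def conv2d(image, kernel):
    """Valid-mode 2-D convolution (strictly, cross-correlation, as in
    most deep-learning libraries) over nested-list arrays."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

# A vertical-edge kernel responding to the stroke in a tiny "1"-like patch:
patch = [[0, 1, 0],
         [0, 1, 0],
         [0, 1, 0]]
kernel = [[-1, 1],
          [-1, 1]]
print(conv2d(patch, kernel))  # [[2, -2], [2, -2]]
```

Stacking such filters, nonlinearities, and pooling yields the learned digit features that the classifier head then scores.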
arXiv Detail & Related papers (2020-04-01T10:41:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.