Data Incubation -- Synthesizing Missing Data for Handwriting Recognition
- URL: http://arxiv.org/abs/2110.07040v1
- Date: Wed, 13 Oct 2021 21:28:18 GMT
- Title: Data Incubation -- Synthesizing Missing Data for Handwriting Recognition
- Authors: Jen-Hao Rick Chang, Martin Bresler, Youssouf Chherawala, Adrien
Delaye, Thomas Deselaers, Ryan Dixon, Oncel Tuzel
- Abstract summary: We show how a generative model can be used to build a better recognizer through the control of content and style.
We use the framework to optimize data synthesis and demonstrate significant improvement on handwriting recognition over a model trained on real data only.
- Score: 16.62493361545184
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we demonstrate how a generative model can be used to build a
better recognizer through the control of content and style. We are building an
online handwriting recognizer from a modest number of training samples. By
training our controllable handwriting synthesizer on the same data, we can
synthesize handwriting with previously underrepresented content (e.g., URLs and
email addresses) and style (e.g., cursive and slanted). Moreover, we propose a
framework to analyze a recognizer that is trained with a mixture of real and
synthetic training data. We use the framework to optimize data synthesis and
demonstrate significant improvement on handwriting recognition over a model
trained on real data only. Overall, we achieve a 66% reduction in Character
Error Rate.
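
For reference, the reported metric is Character Error Rate (CER): the edit distance between the recognized text and the reference, normalized by reference length, so a 66% relative reduction means the mixed-data model's CER is roughly one third of the real-data-only baseline. The Python sketch below illustrates the conventional computation; it is ours, not the authors' code.

def edit_distance(ref: str, hyp: str) -> int:
    # Levenshtein distance via dynamic programming over two rows.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1]

def cer(ref: str, hyp: str) -> float:
    # CER = character-level edit distance / reference length.
    return edit_distance(ref, hyp) / max(len(ref), 1)

For example, cer("handwriting", "handwritng") is 1/11, about 0.09; a 66% relative reduction would bring it down to about 0.03.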
Related papers
- Boosting Semi-Supervised Scene Text Recognition via Viewing and Summarizing [71.29488677105127]
Existing scene text recognition (STR) methods struggle to recognize challenging texts, especially artistic and severely distorted characters.
We propose a contrastive learning-based STR framework that leverages synthetic and real unlabeled data without any human annotation cost.
Our method achieves SOTA performance (94.7% and 70.9% average accuracy on common benchmarks and the Union14M-Benchmark, respectively).
arXiv Detail & Related papers (2024-11-23T15:24:47Z)
- REInstruct: Building Instruction Data from Unlabeled Corpus [49.82314244648043]
We propose REInstruct, a method to automatically build instruction data from an unlabeled corpus.
By training Llama-7b on a combination of 3k seed data and 32k synthetic data from REInstruct, the fine-tuned model achieves a 65.41% win rate on the AlpacaEval leaderboard against text-davinci-003.
arXiv Detail & Related papers (2024-08-20T09:05:03Z)
- Self-Supervised Representation Learning for Online Handwriting Text Classification [0.8594140167290099]
We propose Part of Stroke Masking (POSM), a novel pretext task for pretraining models to extract informative representations from the online handwriting of individuals in English and Chinese.
To evaluate the quality of the extracted representations, we use both intrinsic and extrinsic evaluation methods.
The pretrained models are fine-tuned to achieve state-of-the-art results in tasks such as writer identification, gender classification, and handedness classification.
arXiv Detail & Related papers (2023-10-10T14:07:49Z)
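
The masking pretext task above can be illustrated with a small sketch: hide a contiguous span of an online-handwriting point sequence and train a model to reconstruct it. The (T, 3) array layout and mask ratio below are illustrative assumptions, not details taken from the paper.

import numpy as np

def mask_stroke_span(points: np.ndarray, ratio: float = 0.15, seed: int | None = None):
    # points is assumed to be (T, 3): x, y, pen-state per timestep.
    rng = np.random.default_rng(seed)
    span_len = max(1, int(len(points) * ratio))
    start = int(rng.integers(0, len(points) - span_len + 1))
    span = slice(start, start + span_len)
    masked = points.copy()
    masked[span] = 0.0  # stand-in mask token: zeroed coordinates
    return masked, span

# The encoder sees `masked` and is trained to predict points[span];
# the learned representation is then reused for classification tasks.
stroke = np.random.rand(200, 3).astype(np.float32)
masked, span = mask_stroke_span(stroke, ratio=0.15)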
- Image Captions are Natural Prompts for Text-to-Image Models [70.30915140413383]
We analyze the relationship between the training effect of synthetic data and the synthetic data distribution induced by prompts.
We propose a simple yet effective method that prompts text-to-image generative models to synthesize more informative and diverse training data.
Our method significantly improves the performance of models trained on synthetic training data.
arXiv Detail & Related papers (2023-07-17T14:38:11Z)
- Improving Handwritten OCR with Training Samples Generated by Glyph Conditional Denoising Diffusion Probabilistic Model [10.239782333441031]
We propose a denoising diffusion probabilistic model (DDPM) that generates training samples by learning mappings between printed characters and handwritten images.
Since the synthetic images are not always consistent with the glyph-conditional images, we propose a progressive data filtering strategy that adds only samples with a high confidence of correctness to the training set.
arXiv Detail & Related papers (2023-05-31T04:18:30Z)
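
A minimal sketch of such confidence-based filtering follows; the recognizer interface is hypothetical (assumed to return a transcription and a confidence score), and the threshold value is illustrative.

def filter_synthetic(samples, recognizer, threshold=0.9):
    # Keep synthetic (image, label) pairs that the recognizer already
    # transcribes correctly with high confidence. Lowering `threshold`
    # across rounds progressively admits more synthetic data.
    kept = []
    for image, label in samples:
        predicted, confidence = recognizer(image)  # hypothetical interface
        if predicted == label and confidence >= threshold:
            kept.append((image, label))
    return kept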
- PART: Pre-trained Authorship Representation Transformer [64.78260098263489]
Authors writing documents imprint identifying information within their texts: vocabulary, register, punctuation, misspellings, or even emoji usage.
Previous works use hand-crafted features or classification tasks to train their authorship models, leading to poor performance on out-of-domain authors.
We propose a contrastively trained model that learns authorship embeddings instead of semantics.
arXiv Detail & Related papers (2022-09-30T11:08:39Z)
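
Contrastive authorship training of this kind is commonly built on an InfoNCE-style objective, where two documents by the same author form a positive pair and the rest of the batch serves as negatives. The PyTorch sketch below shows that generic objective; it is not PART's actual training code.

import torch
import torch.nn.functional as F

def authorship_info_nce(anchor: torch.Tensor, positive: torch.Tensor,
                        temperature: float = 0.07) -> torch.Tensor:
    # anchor[i] and positive[i] embed two documents by the same author.
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature                    # (B, B) similarities
    targets = torch.arange(a.size(0), device=a.device)  # diagonal = positives
    return F.cross_entropy(logits, targets)

# Usage: loss = authorship_info_nce(encoder(docs_a), encoder(docs_b))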
- Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition [101.60244147302197]
We introduce contrastive learning and masked image modeling to learn discrimination and generation of text images.
Our method outperforms previous self-supervised text recognition methods by 10.2%-20.2% on irregular scene text recognition datasets.
Our proposed text recognizer exceeds previous state-of-the-art text recognition methods by an average of 5.3% on 11 benchmarks, with similar model size.
arXiv Detail & Related papers (2022-07-01T03:50:26Z)
- Content and Style Aware Generation of Text-line Images for Handwriting Recognition [4.301658883577544]
We propose a generative method for handwritten text-line images conditioned on both visual appearance and textual content.
Our method is able to produce long text-line samples with diverse handwriting styles.
arXiv Detail & Related papers (2022-04-12T05:52:03Z)
- SmartPatch: Improving Handwritten Word Imitation with Patch Discriminators [67.54204685189255]
We propose SmartPatch, a new technique that increases the performance of current state-of-the-art methods.
We combine the well-known patch loss with information gathered from a handwritten text recognition system trained in parallel.
This yields a stronger local discriminator and more realistic, higher-quality generated handwritten words.
arXiv Detail & Related papers (2021-05-21T18:34:21Z)
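
The patch-loss idea above comes from PatchGAN-style discriminators, which score local regions rather than whole images. The PyTorch sketch below shows that generic design; it is our illustration, not SmartPatch itself, which additionally feeds in recognizer information.

import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    # Outputs a grid of real/fake scores, one per local receptive field,
    # instead of a single image-level score.
    def __init__(self, in_channels: int = 1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(128, 1, 4, stride=1, padding=1),  # one score per patch
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Each output cell judges only a local patch, which pushes the generator
# toward locally realistic pen strokes.
scores = PatchDiscriminator()(torch.randn(1, 1, 64, 256))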
- One-shot Compositional Data Generation for Low Resource Handwritten Text Recognition [10.473427493876422]
Low-resource Handwritten Text Recognition is a hard problem due to scarce annotated data and very limited linguistic information.
In this paper we address this problem through a data generation technique based on Bayesian Program Learning.
In contrast to traditional generation approaches, which require a huge amount of annotated images, our method is able to generate human-like handwriting using only one sample of each symbol from the desired alphabet.
arXiv Detail & Related papers (2021-05-11T18:53:01Z)
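
As a toy illustration of the compositional idea above, synthetic words can be assembled from a single exemplar per symbol; the bitmap pasting below is our simplification, whereas the paper learns generative programs per glyph via Bayesian Program Learning.

import numpy as np

def compose_word(glyphs: dict, word: str, seed: int | None = None) -> np.ndarray:
    # glyphs maps each character to one exemplar image of equal height.
    rng = np.random.default_rng(seed)
    parts = []
    for ch in word:
        shift = int(rng.integers(-2, 3))            # small vertical jitter
        parts.append(np.roll(glyphs[ch], shift, axis=0))
    return np.concatenate(parts, axis=1)            # lay glyphs left to right

# Usage: glyphs = {'a': img_a, 'b': img_b}; compose_word(glyphs, "abba")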
- Spatio-Temporal Handwriting Imitation [11.54523121769666]
Subdividing the process into smaller subtasks makes it possible to imitate someone's handwriting with a high chance of being visually indistinguishable to humans.
We show that a typical writer identification system can also be partially fooled by the generated fake handwriting.
arXiv Detail & Related papers (2020-03-24T00:46:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.