HWD: A Novel Evaluation Score for Styled Handwritten Text Generation
- URL: http://arxiv.org/abs/2310.20316v1
- Date: Tue, 31 Oct 2023 09:44:27 GMT
- Title: HWD: A Novel Evaluation Score for Styled Handwritten Text Generation
- Authors: Vittorio Pippi, Fabio Quattrini, Silvia Cascianelli, Rita Cucchiara
- Abstract summary: Styled Handwritten Text Generation (Styled HTG) is an important task in document analysis, aiming to generate text images with the handwriting of given reference images.
We devise the Handwriting Distance (HWD), tailored for HTG evaluation.
In particular, it works in the feature space of a network specifically trained to extract handwriting style features from the variable-length input images and exploits a perceptual distance to compare the subtle geometric features of handwriting.
- Score: 36.416044687373535
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Styled Handwritten Text Generation (Styled HTG) is an important task in
document analysis, aiming to generate text images with the handwriting of given
reference images. In recent years, there has been significant progress in the
development of deep learning models for tackling this task. Being able to
measure the performance of HTG models via a meaningful and representative
criterion is key for fostering the development of this research topic. However,
despite the current adoption of scores for natural image generation evaluation,
assessing the quality of generated handwriting remains challenging. In light of
this, we devise the Handwriting Distance (HWD), tailored for HTG evaluation. In
particular, it works in the feature space of a network specifically trained to
extract handwriting style features from the variable-length input images and
exploits a perceptual distance to compare the subtle geometric features of
handwriting. Through extensive experimental evaluation on different word-level
and line-level datasets of handwritten text images, we demonstrate the
suitability of the proposed HWD as a score for Styled HTG. The pretrained model
used as backbone will be released to ease the adoption of the score, aiming to
provide a valuable tool for evaluating HTG models and thus contributing to
advancing this important research area.
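As a rough, hedged illustration of the kind of computation described above, the sketch below compares a set of real and a set of generated handwriting images by the Euclidean distance between their average feature vectors. The `StyleBackbone` placeholder, the global average pooling over variable-width inputs, and the choice of Euclidean distance are assumptions for illustration only; they stand in for the pretrained style-feature backbone that the authors plan to release.

```python
# Minimal sketch (assumed, not the released HWD implementation) of a
# feature-space distance between real and generated handwriting images.
import torch
import torch.nn as nn


class StyleBackbone(nn.Module):
    """Placeholder CNN mapping variable-width grayscale images to style vectors."""

    def __init__(self, feat_dim: int = 512):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, feat_dim, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Global average pooling makes the output size independent of image width.
        self.pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, 1, H, W); W may differ between calls.
        return self.pool(self.conv(x)).flatten(1)  # (B, feat_dim)


@torch.no_grad()
def handwriting_distance(backbone: nn.Module, real_images, fake_images) -> float:
    """Euclidean distance between the mean style features of the two image sets."""
    real = torch.cat([backbone(img.unsqueeze(0)) for img in real_images], dim=0)
    fake = torch.cat([backbone(img.unsqueeze(0)) for img in fake_images], dim=0)
    return torch.norm(real.mean(dim=0) - fake.mean(dim=0), p=2).item()


# Usage with dummy variable-width images (1 channel, height 32, differing widths):
backbone = StyleBackbone().eval()
real = [torch.rand(1, 32, w) for w in (90, 120, 150)]
fake = [torch.rand(1, 32, w) for w in (100, 140)]
print(handwriting_distance(backbone, real, fake))
```

Global average pooling is one simple way to obtain fixed-size style vectors from variable-length word or line images before comparing feature statistics.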
Related papers
- TypeScore: A Text Fidelity Metric for Text-to-Image Generative Models [39.06617653124486]
We introduce a new evaluation framework called TypeScore to assess a model's ability to generate images with high-fidelity embedded text.
Our proposed metric demonstrates greater resolution than CLIPScore in differentiating popular image generation models.
arXiv Detail & Related papers (2024-11-02T07:56:54Z)
- DiffusionPen: Towards Controlling the Style of Handwritten Text Generation [7.398476020996681]
DiffusionPen (DiffPen) is a 5-shot style handwritten text generation approach based on Latent Diffusion Models.
Our approach captures both textual and stylistic characteristics of seen and unseen words and styles, generating realistic handwritten samples.
Our method outperforms existing methods qualitatively and quantitatively, and its additional generated data can improve the performance of Handwriting Text Recognition (HTR) systems.
arXiv Detail & Related papers (2024-09-09T20:58:25Z)
- Rethinking HTG Evaluation: Bridging Generation and Recognition [7.398476020996681]
We introduce three measures tailored for HTG evaluation: $\text{HTG}_{\text{HTR}}$, $\text{HTG}_{\text{style}}$, and $\text{HTG}_{\text{OOV}}$.
The metrics rely on the recognition error/accuracy of Handwriting Text Recognition and Writer Identification models.
Our findings show that our metrics are richer in information and underscore the necessity of standardized evaluation protocols in HTG.
arXiv Detail & Related papers (2024-09-04T13:15:10Z)
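The recognition-based metrics in the entry above can be illustrated with a small, hedged sketch: it scores generated images by the character error rate (CER) that an arbitrary HTR model obtains against the intended transcriptions. The `recognize` callable is a hypothetical stand-in, and the actual measures in the cited paper may aggregate errors differently.

```python
# Hedged sketch: score generated handwriting by the character error rate (CER)
# an HTR model achieves on it. `recognize` is a hypothetical callable mapping
# an image to a predicted string; it is not an API from the cited paper.
from typing import Callable, List


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def cer_score(recognize: Callable, images: List, transcriptions: List[str]) -> float:
    """Average CER of the recognizer on generated images vs. the target text."""
    total_edits = sum(levenshtein(recognize(img), gt)
                      for img, gt in zip(images, transcriptions))
    total_chars = sum(len(gt) for gt in transcriptions)
    return total_edits / max(total_chars, 1)
```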
- FINEMATCH: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction [66.98008357232428]
We propose FineMatch, a new aspect-based fine-grained text and image matching benchmark.
FineMatch focuses on text and image mismatch detection and correction.
We show that models trained on FineMatch demonstrate enhanced proficiency in detecting fine-grained text and image mismatches.
arXiv Detail & Related papers (2024-04-23T03:42:14Z)
- Exploring Precision and Recall to assess the quality and diversity of LLMs [82.21278402856079]
We introduce a novel evaluation framework for Large Language Models (LLMs) such as Llama-2 and Mistral.
This approach allows for a nuanced assessment of the quality and diversity of generated text without the need for aligned corpora.
arXiv Detail & Related papers (2024-02-16T13:53:26Z)
- How to Choose Pretrained Handwriting Recognition Models for Single Writer Fine-Tuning [23.274139396706264]
Recent advancements in Deep Learning-based Handwritten Text Recognition (HTR) have led to models with remarkable performance on modern and historical manuscripts.
Those models struggle to obtain the same performance when applied to manuscripts with peculiar characteristics, such as language, paper support, ink, and author handwriting.
In this paper, we take into account large, real benchmark datasets and synthetic ones obtained with a styled Handwritten Text Generation model.
We give a quantitative indication of the most relevant characteristics of such data for obtaining an HTR model able to effectively transcribe manuscripts in small collections with as few as five real fine-tuning lines.
arXiv Detail & Related papers (2023-05-04T07:00:28Z)
- Boosting Modern and Historical Handwritten Text Recognition with Deformable Convolutions [52.250269529057014]
Handwritten Text Recognition (HTR) in free-layout pages is a challenging image understanding task.
We propose to adopt deformable convolutions, which can deform depending on the input at hand and better adapt to the geometric variations of the text.
arXiv Detail & Related papers (2022-08-17T06:55:54Z)
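The deformable-convolution entry above can be illustrated with a minimal, hedged snippet using torchvision's `DeformConv2d`, where a small standard convolution predicts per-location sampling offsets. The layer sizes and the offset predictor are illustrative assumptions, not the architecture from the cited paper.

```python
# Minimal sketch of swapping a standard convolution for a deformable one.
import torch
from torchvision.ops import DeformConv2d

x = torch.randn(1, 64, 32, 128)             # (B, C, H, W) feature map of a text line
offset_pred = torch.nn.Conv2d(64, 2 * 3 * 3, kernel_size=3, padding=1)
deform_conv = DeformConv2d(64, 128, kernel_size=3, padding=1)

offsets = offset_pred(x)                    # per-location sampling offsets (2 per kernel tap)
out = deform_conv(x, offsets)               # sampling grid deforms with the input at hand
print(out.shape)                            # torch.Size([1, 128, 32, 128])
```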
- Improving Generation and Evaluation of Visual Stories via Semantic Consistency [72.00815192668193]
Given a series of natural language captions, an agent must generate a sequence of images that correspond to the captions.
Prior work has introduced recurrent generative models which outperform text-to-image synthesis models on this task.
We present a number of improvements to prior modeling approaches, including the addition of a dual learning framework.
arXiv Detail & Related papers (2021-05-20T20:42:42Z)
- Handwriting Transformers [98.3964093654716]
We propose a transformer-based styled handwritten text image generation approach, HWT, that strives to learn both style-content entanglement and global and local writing style patterns.
The proposed HWT captures the long and short range relationships within the style examples through a self-attention mechanism.
Our proposed HWT generates realistic styled handwritten text images and significantly outperforms the state of the art.
arXiv Detail & Related papers (2021-04-08T17:59:43Z)
- Spectral Graph-based Features for Recognition of Handwritten Characters: A Case Study on Handwritten Devanagari Numerals [0.0]
We propose an approach that exploits the robust graph representation and spectral graph embedding concept to represent handwritten characters.
To corroborate the efficacy of the proposed method, extensive experiments were carried out on the standard handwritten numeral dataset of the Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata.
arXiv Detail & Related papers (2020-07-07T08:40:08Z)
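For the spectral graph-based entry above, the following hedged sketch shows one generic way such features can be computed: treat the ink pixels of a binarized character as graph nodes with 8-connectivity and use the smallest eigenvalues of the normalized graph Laplacian as a fixed-length descriptor. The graph construction and the number of eigenvalues are assumed choices, not the exact pipeline of the cited paper.

```python
# Hypothetical illustration of spectral graph features for a handwritten character.
import numpy as np
import networkx as nx


def spectral_features(binary_img: np.ndarray, k: int = 10) -> np.ndarray:
    """Return the k smallest normalized-Laplacian eigenvalues of the ink graph."""
    g = nx.Graph()
    rows, cols = np.nonzero(binary_img)
    pixels = set(zip(rows.tolist(), cols.tolist()))
    for r, c in pixels:
        g.add_node((r, c))
        # 8-connectivity between neighboring ink pixels
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if (dr, dc) != (0, 0) and (r + dr, c + dc) in pixels:
                    g.add_edge((r, c), (r + dr, c + dc))
    lap = nx.normalized_laplacian_matrix(g).toarray()
    eigvals = np.sort(np.linalg.eigvalsh(lap))
    return eigvals[:k]  # fixed-length spectral descriptor (assumes >= k ink pixels)
```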
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.