Handwritten Text Generation from Visual Archetypes
- URL: http://arxiv.org/abs/2303.15269v1
- Date: Mon, 27 Mar 2023 14:58:20 GMT
- Title: Handwritten Text Generation from Visual Archetypes
- Authors: Vittorio Pippi, Silvia Cascianelli, Rita Cucchiara
- Abstract summary: We devise a Transformer-based model for Few-Shot styled handwritten text generation.
We obtain a robust representation of unseen writers' calligraphy by exploiting specific pre-training on a large synthetic dataset.
- Score: 25.951540903019467
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Generating synthetic images of handwritten text in a writer-specific style is
a challenging task, especially in the case of unseen styles and new words, and
even more so when the latter contain characters that are rarely encountered
during training. While emulating a writer's style has been recently addressed
by generative models, the generalization towards rare characters has been
disregarded. In this work, we devise a Transformer-based model for Few-Shot
styled handwritten text generation and focus on obtaining a robust and
informative representation of both the text and the style. In particular, we
propose a novel representation of the textual content as a sequence of dense
vectors obtained from images of symbols written as standard GNU Unifont glyphs,
which can be considered their visual archetypes. This strategy is more suitable
for generating characters that, despite having been seen rarely during
training, possibly share visual details with the frequently observed ones. As
for the style, we obtain a robust representation of unseen writers' calligraphy
by exploiting specific pre-training on a large synthetic dataset. Quantitative
and qualitative results demonstrate the effectiveness of our proposal in
generating words in unseen styles and with rare characters more faithfully than
existing approaches relying on independent one-hot encodings of the characters.
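The content representation described in the abstract can be illustrated with a minimal sketch: each character is rendered as a GNU Unifont glyph and the flattened bitmap is used as its dense content vector, so rarely seen characters inherit stroke-level similarity to frequently seen ones. This is not the authors' code; the font path ("unifont.ttf"), the 16x16 rasterization, and all function names are assumptions made for illustration only.

```python
# Illustrative sketch (not the paper's implementation): build a sequence of
# dense content vectors from rendered GNU Unifont glyphs, to be used in place
# of independent one-hot character encodings.
from PIL import Image, ImageDraw, ImageFont  # assumed dependency: Pillow
import numpy as np

GLYPH_SIZE = 16  # Unifont glyphs are nominally 16x16 pixels (assumption here)

def glyph_vector(char: str, font: ImageFont.FreeTypeFont) -> np.ndarray:
    """Render one character on a small canvas and flatten it to a dense vector."""
    canvas = Image.new("L", (GLYPH_SIZE, GLYPH_SIZE), color=0)
    draw = ImageDraw.Draw(canvas)
    draw.text((0, 0), char, fill=255, font=font)
    return (np.asarray(canvas, dtype=np.float32) / 255.0).reshape(-1)

def content_sequence(text: str, font_path: str = "unifont.ttf") -> np.ndarray:
    """Map a string to a (len(text), GLYPH_SIZE**2) array of glyph vectors,
    suitable as input embeddings for a Transformer content encoder."""
    font = ImageFont.truetype(font_path, GLYPH_SIZE)  # hypothetical local font file
    return np.stack([glyph_vector(c, font) for c in text])

# Example: characters that are rare in a training corpus (e.g. "ß") still share
# visual details with frequent ones, which one-hot codes cannot express.
vectors = content_sequence("Größe")
print(vectors.shape)  # (5, 256)
```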
Related papers
- Learning to Generate Text in Arbitrary Writing Styles [6.7308816341849695]
It is desirable for language models to produce text in an author-specific style on the basis of a potentially small writing sample.
We propose to guide a language model to generate text in a target style using contrastively-trained representations that capture stylometric features.
arXiv Detail & Related papers (2023-12-28T18:58:52Z) - Visual Captioning at Will: Describing Images and Videos Guided by a Few
Stylized Sentences [49.66987347397398]
Few-Shot Stylized Visual Captioning aims to generate captions in any desired style, using only a few examples as guidance during inference.
We propose a framework called FS-StyleCap for this task, which utilizes a conditional encoder-decoder language model and a visual projection module.
arXiv Detail & Related papers (2023-07-31T04:26:01Z) - Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion
Models [70.86603627188519]
We focus on a novel, yet challenging task of generating a coherent image sequence based on a given storyline, denoted as open-ended visual storytelling.
We propose a learning-based auto-regressive image generation model, termed as StoryGen, with a novel vision-language context module.
We show StoryGen can generalize to unseen characters without any optimization, and generate image sequences with coherent content and consistent character.
arXiv Detail & Related papers (2023-06-01T17:58:50Z) - Disentangling Writer and Character Styles for Handwriting Generation [8.33116145030684]
We present the style-disentangled Transformer (SDT), which employs two complementary contrastive objectives to extract the style commonalities of reference samples.
Our empirical findings reveal that the two learned style representations provide information at different frequency magnitudes.
arXiv Detail & Related papers (2023-03-26T14:32:02Z) - Learning Generative Structure Prior for Blind Text Image
Super-resolution [153.05759524358467]
We present a novel prior that focuses more on the character structure.
To restrict the generative space of StyleGAN, we store the discrete features for each character in a codebook.
The proposed structure prior exerts stronger character-specific guidance to restore faithful and precise strokes of a designated character.
arXiv Detail & Related papers (2023-03-26T13:54:28Z) - Character-Aware Models Improve Visual Text Rendering [57.19915686282047]
Current image generation models struggle to reliably produce well-formed visual text.
Character-aware models provide large gains on a novel spelling task.
Our models set a much higher state-of-the-art on visual spelling, with 30+ point accuracy gains over competitors on rare words.
arXiv Detail & Related papers (2022-12-20T18:59:23Z) - Content and Style Aware Generation of Text-line Images for Handwriting
Recognition [4.301658883577544]
We propose a generative method for handwritten text-line images conditioned on both visual appearance and textual content.
Our method is able to produce long text-line samples with diverse handwriting styles.
arXiv Detail & Related papers (2022-04-12T05:52:03Z) - SLOGAN: Handwriting Style Synthesis for Arbitrary-Length and
Out-of-Vocabulary Text [35.83345711291558]
We propose a novel method that can synthesize parameterized and controllable handwriting Styles for arbitrary-Length and Out-of-vocabulary text.
We embed the text content by providing an easily obtainable printed style image, so that the diversity of the content can be flexibly achieved.
Our method can synthesize words that are not included in the training vocabulary and with various new styles.
arXiv Detail & Related papers (2022-02-23T12:13:27Z) - Scalable Font Reconstruction with Dual Latent Manifolds [55.29525824849242]
We propose a deep generative model that performs typography analysis and font reconstruction.
Our approach enables us to massively scale up the number of character types we can effectively model.
We evaluate on the task of font reconstruction over various datasets representing character types of many languages.
arXiv Detail & Related papers (2021-09-10T20:37:43Z) - ZiGAN: Fine-grained Chinese Calligraphy Font Generation via a Few-shot
Style Transfer Approach [7.318027179922774]
ZiGAN is a powerful end-to-end Chinese calligraphy font generation framework.
It does not require any manual operation or redundant preprocessing to generate fine-grained target-style characters.
Our method has a state-of-the-art generalization ability in few-shot Chinese character style transfer.
arXiv Detail & Related papers (2021-08-08T09:50:20Z) - Handwriting Transformers [98.3964093654716]
We propose a transformer-based styled handwritten text image generation approach, HWT, that strives to learn both style-content entanglement and global and local writing style patterns.
The proposed HWT captures the long and short range relationships within the style examples through a self-attention mechanism.
Our proposed HWT generates realistic styled handwritten text images and significantly outperforms the state-of-the-art demonstrated.
arXiv Detail & Related papers (2021-04-08T17:59:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.