SLOGAN: Handwriting Style Synthesis for Arbitrary-Length and
Out-of-Vocabulary Text
- URL: http://arxiv.org/abs/2202.11456v1
- Date: Wed, 23 Feb 2022 12:13:27 GMT
- Title: SLOGAN: Handwriting Style Synthesis for Arbitrary-Length and
Out-of-Vocabulary Text
- Authors: Canjie Luo, Yuanzhi Zhu, Lianwen Jin, Zhe Li, Dezhi Peng
- Abstract summary: We propose a novel method that can synthesize parameterized and controllable handwriting styles for arbitrary-length and out-of-vocabulary text.
We embed the text content by providing an easily obtainable printed style image, so that content diversity can be achieved flexibly by changing the input printed image.
Our method can synthesize words that are not included in the training vocabulary, in a variety of new styles.
- Score: 35.83345711291558
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large amounts of labeled data are urgently required for the training of
robust text recognizers. However, collecting handwriting data of diverse
styles, along with an immense lexicon, is considerably expensive. Although data
synthesis is a promising way to relieve data hunger, two key issues of
handwriting synthesis, namely, style representation and content embedding,
remain unsolved. To this end, we propose a novel method that can synthesize
parameterized and controllable handwriting Styles for arbitrary-Length and
Out-of-vocabulary text based on a Generative Adversarial Network (GAN), termed
SLOGAN. Specifically, we propose a style bank to parameterize the specific
handwriting styles as latent vectors, which are input to a generator as style
priors to achieve the corresponding handwritten styles. The training of the
style bank requires only the writer identification of the source images, rather
than attribute annotations. Moreover, we embed the text content by providing an
easily obtainable printed style image, so that the diversity of the content can
be flexibly achieved by changing the input printed image. Finally, the
generator is guided by dual discriminators to handle handwriting
characteristics that appear both as separated characters and as a series of
cursive joins. Our method can synthesize words that are not included in the
training vocabulary, in a variety of new styles. Extensive experiments have shown that
high-quality text images with great style diversity and rich vocabulary can be
synthesized using our method, thereby enhancing the robustness of the
recognizer.
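To make the pipeline concrete, below is a minimal, hypothetical PyTorch sketch of the three components the abstract names: a style bank indexed by writer ID, a generator conditioned on a printed content image and a style prior, and a pair of discriminators at different scales. All module names, layer sizes, and the style-injection mechanism are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch of the SLOGAN pipeline described above. Module names,
# tensor sizes, and the style-injection mechanism are illustrative assumptions,
# not the authors' released implementation.
import torch
import torch.nn as nn


class StyleBank(nn.Module):
    """Parameterizes each writer's handwriting style as a learnable latent vector.
    Training needs only the writer ID of each source image (no attribute labels):
    the ID simply indexes one row of the embedding table."""
    def __init__(self, num_writers: int, style_dim: int = 128):
        super().__init__()
        self.bank = nn.Embedding(num_writers, style_dim)

    def forward(self, writer_ids: torch.Tensor) -> torch.Tensor:
        return self.bank(writer_ids)  # (B, style_dim)


class Generator(nn.Module):
    """Maps a printed-style content image plus a style prior to a handwritten-style
    image (toy encoder/decoder; the real architecture is more elaborate)."""
    def __init__(self, style_dim: int = 128):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.inject = nn.Linear(style_dim, 128)  # style prior as a channel-wise bias
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 1, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, printed: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
        feat = self.encode(printed)                          # (B, 128, H/4, W/4)
        feat = feat + self.inject(style)[:, :, None, None]   # broadcast style prior
        return self.decode(feat)                             # back to (B, 1, H, W)


class PatchDiscriminator(nn.Module):
    """Stand-in for one of the dual discriminators; two instances with different
    receptive fields play the roles of the character-level and cursive-join-level
    critics mentioned in the abstract."""
    def __init__(self, kernel: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, kernel, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, kernel, stride=2, padding=1),
        )

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        return self.net(img)  # patch-wise real/fake logits


# Toy forward pass: 16 printed word images (64x256 px) rendered in 4 writers' styles.
style_bank, gen = StyleBank(num_writers=4), Generator()
d_char, d_word = PatchDiscriminator(kernel=4), PatchDiscriminator(kernel=8)
printed = torch.randn(16, 1, 64, 256)
style = style_bank(torch.randint(0, 4, (16,)))
fake = gen(printed, style)
char_logits, word_logits = d_char(fake), d_word(fake)

Because content is carried entirely by the printed input image in this sketch, arbitrary-length and out-of-vocabulary words only require rendering new printed images; the style bank and generator are unchanged.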
Related papers
- Decoupling Layout from Glyph in Online Chinese Handwriting Generation [6.566541829858544]
We develop a text line layout generator and stylized font synthesizer.
The layout generator performs in-context-like learning based on the text content and the provided style references to generate positions for each glyph autoregressively.
The font synthesizer, which consists of a character embedding dictionary, a multi-scale calligraphy style encoder, and a 1D U-Net-based diffusion denoiser, generates each glyph at its position while imitating the calligraphy style extracted from the given style references.
arXiv Detail & Related papers (2024-10-03T08:46:17Z) - DiffusionPen: Towards Controlling the Style of Handwritten Text Generation [7.398476020996681]
DiffusionPen (DiffPen) is a 5-shot style handwritten text generation approach based on Latent Diffusion Models.
Our approach captures both textual and stylistic characteristics of seen and unseen words and styles, generating realistic handwritten samples.
Our method outperforms existing methods qualitatively and quantitatively, and its additional generated data can improve the performance of Handwriting Text Recognition (HTR) systems.
arXiv Detail & Related papers (2024-09-09T20:58:25Z) - ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style
Transfer [57.6482608202409]
Textual style transfer is the task of transforming stylistic properties of text while preserving meaning.
We introduce a novel diffusion-based framework for general-purpose style transfer that can be flexibly adapted to arbitrary target styles.
We validate the method on the Enron Email Corpus, with both human and automatic evaluations, and find that it outperforms strong baselines on formality, sentiment, and even authorship style transfer.
arXiv Detail & Related papers (2023-08-29T17:36:02Z) - Handwritten Text Generation from Visual Archetypes [25.951540903019467]
We devise a Transformer-based model for Few-Shot styled handwritten text generation.
We obtain a robust representation of unseen writers' calligraphy by exploiting specific pre-training on a large synthetic dataset.
arXiv Detail & Related papers (2023-03-27T14:58:20Z) - Disentangling Writer and Character Styles for Handwriting Generation [8.33116145030684]
We present the style-disentangled Transformer (SDT), which employs two complementary contrastive objectives to extract the style commonalities of reference samples.
Our empirical findings reveal that the two learned style representations provide information at different frequency magnitudes.
arXiv Detail & Related papers (2023-03-26T14:32:02Z) - StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized
Tokenizer of a Large-Scale Generative Model [64.26721402514957]
We propose StylerDALLE, a style transfer method that uses natural language to describe abstract art styles.
Specifically, we formulate the language-guided style transfer task as a non-autoregressive token sequence translation.
To incorporate style information, we propose a Reinforcement Learning strategy with CLIP-based language supervision.
arXiv Detail & Related papers (2023-03-16T12:44:44Z) - Content and Style Aware Generation of Text-line Images for Handwriting
Recognition [4.301658883577544]
We propose a generative method for handwritten text-line images conditioned on both visual appearance and textual content.
Our method is able to produce long text-line samples with diverse handwriting styles.
arXiv Detail & Related papers (2022-04-12T05:52:03Z) - Generating More Pertinent Captions by Leveraging Semantics and Style on
Multi-Source Datasets [56.018551958004814]
This paper addresses the task of generating fluent descriptions by training on a non-uniform combination of data sources.
Large-scale datasets with noisy image-text pairs provide a sub-optimal source of supervision.
We propose to leverage and separate semantics and descriptive style through the incorporation of a style token and keywords extracted through a retrieval component.
arXiv Detail & Related papers (2021-11-24T19:00:05Z) - Improving Disentangled Text Representation Learning with
Information-Theoretic Guidance [99.68851329919858]
The discrete nature of natural language makes disentangling textual representations more challenging.
Inspired by information theory, we propose a novel method that effectively manifests disentangled representations of text.
Experiments on both conditional text generation and text-style transfer demonstrate the high quality of our disentangled representation.
arXiv Detail & Related papers (2020-06-01T03:36:01Z) - Separating Content from Style Using Adversarial Learning for Recognizing
Text in the Wild [103.51604161298512]
We propose an adversarial learning framework for the generation and recognition of multiple characters in an image.
Our framework can be integrated into recent recognition methods to achieve new state-of-the-art recognition accuracy.
arXiv Detail & Related papers (2020-01-13T12:41:42Z)