StrokeGAN: Reducing Mode Collapse in Chinese Font Generation via Stroke
Encoding
- URL: http://arxiv.org/abs/2012.08687v2
- Date: Mon, 11 Jan 2021 01:41:46 GMT
- Title: StrokeGAN: Reducing Mode Collapse in Chinese Font Generation via Stroke
Encoding
- Authors: Jinshan Zeng, Qi Chen, Yunxin Liu, Mingwen Wang, Yuan Yao
- Abstract summary: We introduce a one-bit stroke encoding to capture the key mode information of Chinese characters.
We incorporate this mode information into CycleGAN, a popular deep generative model for Chinese font generation.
StrokeGAN generally outperforms the state-of-the-art methods in terms of content and recognition accuracies.
- Score: 20.877391644999534
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The generation of stylish Chinese fonts is an important problem involved in
many applications. Most existing generation methods are based on deep
generative models, particularly, the generative adversarial networks (GAN)
based models. However, these deep generative models may suffer from the mode
collapse issue, which significantly degrades the diversity and quality of
generated results. In this paper, we introduce a one-bit stroke encoding to
capture the key mode information of Chinese characters and then incorporate it
into CycleGAN, a popular deep generative model for Chinese font generation. As
a result, we propose an efficient method called StrokeGAN, mainly motivated by
the observation that the stroke encoding carries a large amount of mode information about
Chinese characters. In order to reconstruct the one-bit stroke encoding of the
associated generated characters, we introduce a stroke-encoding reconstruction
loss imposed on the discriminator. Equipped with such one-bit stroke encoding
and stroke-encoding reconstruction loss, the mode collapse issue of CycleGAN
can be significantly alleviated, with an improved preservation of strokes and
diversity of generated characters. The effectiveness of StrokeGAN is
demonstrated by a series of generation tasks over nine datasets with different
fonts. The numerical results demonstrate that StrokeGAN generally outperforms
the state-of-the-art methods in terms of content and recognition accuracies, as
well as a stroke-error metric, and also generates more realistic characters.
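The abstract's core idea can be illustrated with a small sketch. Note this is an illustrative reconstruction, not the paper's code: the five-stroke inventory and the per-character stroke tables below are toy stand-ins (the actual encoding covers the full set of basic Chinese stroke types), and the loss here is a plain binary cross-entropy standing in for the stroke-encoding reconstruction loss imposed on the discriminator.

```python
import math

# Assumed mini-inventory of basic stroke types (illustrative, not the paper's full set).
STROKE_TYPES = ["heng", "shu", "pie", "na", "dian"]

# Hypothetical stroke decompositions for two characters.
CHAR_STROKES = {
    "\u5341": ["heng", "shu"],   # 十
    "\u4eba": ["pie", "na"],     # 人
}

def one_bit_stroke_encoding(char: str) -> list[int]:
    """One-bit encoding: component i is 1 iff stroke type i occurs in the character."""
    strokes = set(CHAR_STROKES.get(char, []))
    return [1 if s in strokes else 0 for s in STROKE_TYPES]

def stroke_reconstruction_loss(predicted: list[float], target: list[int]) -> float:
    """Binary cross-entropy between the discriminator's predicted stroke
    encoding and the true one-bit encoding of the generated character."""
    eps = 1e-7  # clamp to avoid log(0)
    return -sum(
        t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
        for p, t in zip(predicted, target)
    ) / len(target)

print(one_bit_stroke_encoding("\u5341"))  # [1, 1, 0, 0, 0]
```

In training, this auxiliary loss would be added to the usual CycleGAN objectives, pushing the discriminator to recover the stroke encoding and thereby penalizing generated characters whose strokes collapse to a few modes.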
Related papers
- Skeleton and Font Generation Network for Zero-shot Chinese Character Generation [53.08596064763731]
We propose a novel Skeleton and Font Generation Network (SFGN) to achieve a more robust Chinese character font generation.
We conduct experiments on misspelled characters, a substantial portion of which slightly differs from the common ones.
Our approach visually demonstrates the efficacy of generated images and outperforms current state-of-the-art font generation methods.
arXiv Detail & Related papers (2025-01-14T12:15:49Z) - DeepCalliFont: Few-shot Chinese Calligraphy Font Synthesis by
Integrating Dual-modality Generative Models [20.76773399161289]
Few-shot font generation, especially for Chinese calligraphy fonts, is a challenging and ongoing problem.
We propose a novel model, DeepCalliFont, for few-shot Chinese calligraphy font synthesis by integrating dual-modality generative models.
arXiv Detail & Related papers (2023-12-16T04:23:12Z) - VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and
Quantization [52.870638830417]
We propose a VQGAN-based framework (i.e., VQ-Font) to enhance glyph fidelity through token prior refinement and structure-aware enhancement.
Specifically, we pre-train a VQGAN to encapsulate font token prior within a codebook. Subsequently, VQ-Font refines the synthesized glyphs with the codebook to eliminate the domain gap between synthesized and real-world strokes.
arXiv Detail & Related papers (2023-08-27T06:32:20Z) - Diff-Font: Diffusion Model for Robust One-Shot Font Generation [110.45944936952309]
We propose a novel one-shot font generation method based on a diffusion model, named Diff-Font.
The proposed model aims to generate the entire font library by giving only one sample as the reference.
The well-trained Diff-Font is not only robust to font gap and font variation, but also achieved promising performance on difficult character generation.
arXiv Detail & Related papers (2022-12-12T13:51:50Z) - SGCE-Font: Skeleton Guided Channel Expansion for Chinese Font Generation [19.20334101519465]
This paper proposes a novel information guidance module called the skeleton guided channel expansion (SGCE) module for the Chinese font generation.
Numerical results show that the mode collapse issue suffered by the known CycleGAN can be effectively alleviated by equipping with the proposed SGCE module.
arXiv Detail & Related papers (2022-11-26T04:21:46Z) - Chinese Character Recognition with Radical-Structured Stroke Trees [51.8541677234175]
We represent each Chinese character as a stroke tree, which is organized according to its radical structures.
We propose a two-stage decomposition framework, where a Feature-to-Radical Decoder perceives radical structures and radical regions.
A Radical-to-Stroke Decoder further predicts the stroke sequences according to the features of radical regions.
arXiv Detail & Related papers (2022-11-24T10:28:55Z) - StrokeGAN+: Few-Shot Semi-Supervised Chinese Font Generation with Stroke
Encoding [23.886977380061662]
This paper proposes an effective model called StrokeGAN+, which incorporates the stroke encoding and the few-shot semi-supervised scheme into the CycleGAN model.
Experimental results show that the mode collapse issue can be effectively alleviated by the introduced one-bit stroke encoding and few-shot semi-supervised training scheme.
arXiv Detail & Related papers (2022-11-11T13:39:26Z) - FontTransformer: Few-shot High-resolution Chinese Glyph Image Synthesis
via Stacked Transformers [21.705680113996742]
This paper proposes FontTransformer, a novel few-shot learning model, for high-resolution Chinese glyph image synthesis.
We also design a novel encoding scheme to feed more glyph information and prior knowledge to our model.
arXiv Detail & Related papers (2022-10-12T15:09:22Z) - Scalable Font Reconstruction with Dual Latent Manifolds [55.29525824849242]
We propose a deep generative model that performs typography analysis and font reconstruction.
Our approach enables us to massively scale up the number of character types we can effectively model.
We evaluate on the task of font reconstruction over various datasets representing character types of many languages.
arXiv Detail & Related papers (2021-09-10T20:37:43Z) - A Multi-Implicit Neural Representation for Fonts [79.6123184198301]
Font-specific discontinuities like edges and corners are difficult to represent using neural networks.
We introduce multi-implicits to represent fonts as a permutation-invariant set of learned implicit functions, without losing features.
arXiv Detail & Related papers (2021-06-12T21:40:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.