StrokeGAN: Reducing Mode Collapse in Chinese Font Generation via Stroke
Encoding
- URL: http://arxiv.org/abs/2012.08687v2
- Date: Mon, 11 Jan 2021 01:41:46 GMT
- Title: StrokeGAN: Reducing Mode Collapse in Chinese Font Generation via Stroke
Encoding
- Authors: Jinshan Zeng, Qi Chen, Yunxin Liu, Mingwen Wang, Yuan Yao
- Abstract summary: We introduce a one-bit stroke encoding to capture the key mode information of Chinese characters.
We incorporate this mode information into CycleGAN, a popular deep generative model for Chinese font generation.
StrokeGAN generally outperforms the state-of-the-art methods in terms of content and recognition accuracies.
- Score: 20.877391644999534
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The generation of stylish Chinese fonts is an important problem involved in
many applications. Most existing generation methods are based on deep
generative models, particularly, the generative adversarial networks (GAN)
based models. However, these deep generative models may suffer from the mode
collapse issue, which significantly degrades the diversity and quality of
generated results. In this paper, we introduce a one-bit stroke encoding to
capture the key mode information of Chinese characters and then incorporate it
into CycleGAN, a popular deep generative model for Chinese font generation. As
a result, we propose an efficient method called StrokeGAN, mainly motivated by
the observation that the stroke encoding carries a large amount of mode information about
Chinese characters. In order to reconstruct the one-bit stroke encoding of the
associated generated characters, we introduce a stroke-encoding reconstruction
loss imposed on the discriminator. Equipped with such one-bit stroke encoding
and stroke-encoding reconstruction loss, the mode collapse issue of CycleGAN
can be significantly alleviated, with an improved preservation of strokes and
diversity of generated characters. The effectiveness of StrokeGAN is
demonstrated by a series of generation tasks over nine datasets with different
fonts. The numerical results demonstrate that StrokeGAN generally outperforms
the state-of-the-art methods in terms of content and recognition accuracies, as
well as a stroke-error metric, and also generates more realistic characters.
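The abstract's core idea can be illustrated with a small sketch. Note this is an illustrative reconstruction, not the paper's code: the five-stroke inventory and the per-character stroke tables below are toy stand-ins (the actual encoding covers the full set of basic Chinese stroke types), and the loss here is a plain binary cross-entropy standing in for the stroke-encoding reconstruction loss imposed on the discriminator.

```python
import math

# Assumed mini-inventory of basic stroke types (illustrative, not the paper's full set).
STROKE_TYPES = ["heng", "shu", "pie", "na", "dian"]

# Hypothetical stroke decompositions for two characters.
CHAR_STROKES = {
    "\u5341": ["heng", "shu"],   # 十
    "\u4eba": ["pie", "na"],     # 人
}

def one_bit_stroke_encoding(char: str) -> list[int]:
    """One-bit encoding: component i is 1 iff stroke type i occurs in the character."""
    strokes = set(CHAR_STROKES.get(char, []))
    return [1 if s in strokes else 0 for s in STROKE_TYPES]

def stroke_reconstruction_loss(predicted: list[float], target: list[int]) -> float:
    """Binary cross-entropy between the discriminator's predicted stroke
    encoding and the true one-bit encoding of the generated character."""
    eps = 1e-7  # clamp to avoid log(0)
    return -sum(
        t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
        for p, t in zip(predicted, target)
    ) / len(target)

print(one_bit_stroke_encoding("\u5341"))  # [1, 1, 0, 0, 0]
```

In training, this auxiliary loss would be added to the usual CycleGAN objectives, pushing the discriminator to recover the stroke encoding and thereby penalizing generated characters whose strokes collapse to a few modes.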
Related papers
- Skeleton and Font Generation Network for Zero-shot Chinese Character Generation [53.08596064763731]
We propose a novel Skeleton and Font Generation Network (SFGN) to achieve a more robust Chinese character font generation.
We conduct experiments on misspelled characters, a substantial portion of which slightly differs from the common ones.
Our approach visually demonstrates the efficacy of generated images and outperforms current state-of-the-art font generation methods.
arXiv Detail & Related papers (2025-01-14T12:15:49Z) - DeepCalliFont: Few-shot Chinese Calligraphy Font Synthesis by
Integrating Dual-modality Generative Models [20.76773399161289]
Few-shot font generation, especially for Chinese calligraphy fonts, is a challenging and ongoing problem.
We propose a novel model, DeepCalliFont, for few-shot Chinese calligraphy font synthesis by integrating dual-modality generative models.
arXiv Detail & Related papers (2023-12-16T04:23:12Z) - VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and
Quantization [52.870638830417]
We propose a VQGAN-based framework (i.e., VQ-Font) to enhance glyph fidelity through token prior refinement and structure-aware enhancement.
Specifically, we pre-train a VQGAN to encapsulate font token prior within a codebook. Subsequently, VQ-Font refines the synthesized glyphs with the codebook to eliminate the domain gap between synthesized and real-world strokes.
arXiv Detail & Related papers (2023-08-27T06:32:20Z) - Diff-Font: Diffusion Model for Robust One-Shot Font Generation [110.45944936952309]
We propose a novel one-shot font generation method based on a diffusion model, named Diff-Font.
The proposed model aims to generate the entire font library by giving only one sample as the reference.
The well-trained Diff-Font is not only robust to font gap and font variation, but also achieved promising performance on difficult character generation.
arXiv Detail & Related papers (2022-12-12T13:51:50Z) - SGCE-Font: Skeleton Guided Channel Expansion for Chinese Font Generation [19.20334101519465]
This paper proposes a novel information guidance module called the skeleton guided channel expansion (SGCE) module for the Chinese font generation.
Numerical results show that the mode collapse issue suffered by the known CycleGAN can be effectively alleviated by equipping with the proposed SGCE module.
arXiv Detail & Related papers (2022-11-26T04:21:46Z) - Chinese Character Recognition with Radical-Structured Stroke Trees [51.8541677234175]
We represent each Chinese character as a stroke tree, which is organized according to its radical structures.
We propose a two-stage decomposition framework, where a Feature-to-Radical Decoder perceives radical structures and radical regions.
A Radical-to-Stroke Decoder further predicts the stroke sequences according to the features of radical regions.
arXiv Detail & Related papers (2022-11-24T10:28:55Z) - StrokeGAN+: Few-Shot Semi-Supervised Chinese Font Generation with Stroke
Encoding [23.886977380061662]
This paper proposes an effective model called StrokeGAN+, which incorporates the stroke encoding and the few-shot semi-supervised scheme into the CycleGAN model.
Experimental results show that the mode collapse issue can be effectively alleviated by the introduced one-bit stroke encoding and few-shot semi-supervised training scheme.
arXiv Detail & Related papers (2022-11-11T13:39:26Z) - FontTransformer: Few-shot High-resolution Chinese Glyph Image Synthesis
via Stacked Transformers [21.705680113996742]
This paper proposes FontTransformer, a novel few-shot learning model, for high-resolution Chinese glyph image synthesis.
We also design a novel encoding scheme to feed more glyph information and prior knowledge to our model.
arXiv Detail & Related papers (2022-10-12T15:09:22Z) - Scalable Font Reconstruction with Dual Latent Manifolds [55.29525824849242]
We propose a deep generative model that performs typography analysis and font reconstruction.
Our approach enables us to massively scale up the number of character types we can effectively model.
We evaluate on the task of font reconstruction over various datasets representing character types of many languages.
arXiv Detail & Related papers (2021-09-10T20:37:43Z) - A Multi-Implicit Neural Representation for Fonts [79.6123184198301]
Font-specific discontinuities like edges and corners are difficult to represent using neural networks.
We introduce multi-implicits to represent fonts as a permutation-invariant set of learned implicit functions, without losing features.
arXiv Detail & Related papers (2021-06-12T21:40:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.