VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and
Quantization
- URL: http://arxiv.org/abs/2308.14018v1
- Date: Sun, 27 Aug 2023 06:32:20 GMT
- Title: VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and
Quantization
- Authors: Mingshuai Yao, Yabo Zhang, Xianhui Lin, Xiaoming Li, Wangmeng Zuo
- Abstract summary: We propose a VQGAN-based framework (i.e., VQ-Font) to enhance glyph fidelity through token prior refinement and structure-aware enhancement.
Specifically, we pre-train a VQGAN to encapsulate font token prior within a codebook. Subsequently, VQ-Font refines the synthesized glyphs with the codebook to eliminate the domain gap between synthesized and real-world strokes.
- Score: 52.870638830417
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Few-shot font generation is challenging, as it needs to capture the
fine-grained stroke styles from a limited set of reference glyphs and then
transfer them to other characters, which are expected to share similar styles.
However, due to the diversity and complexity of Chinese font styles, the
synthesized glyphs of existing methods usually exhibit visible artifacts, such
as missing details and distorted strokes. In this paper, we propose a
VQGAN-based framework (i.e., VQ-Font) to enhance glyph fidelity through token
prior refinement and structure-aware enhancement. Specifically, we pre-train a
VQGAN to encapsulate the font token prior within a codebook. Subsequently, VQ-Font
refines the synthesized glyphs with the codebook to eliminate the domain gap
between synthesized and real-world strokes. Furthermore, our VQ-Font leverages
the inherent design of Chinese characters, in which structural components such as
radicals are combined in specific arrangements, to recalibrate fine-grained
styles based on the references. This process improves the
matching and fusion of styles at the structure level. Both modules collaborate
to enhance the fidelity of the generated fonts. Experiments on a collected font
dataset show that our VQ-Font outperforms the competing methods both
quantitatively and qualitatively, especially in generating challenging styles.
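As a rough illustration of the token prior refinement described in the abstract, the following minimal PyTorch sketch snaps each synthesized feature token to its nearest entry in a learned codebook; the module name, codebook size, and feature shapes are assumptions for illustration, not the paper's actual configuration.

```python
# Minimal sketch of codebook-based token refinement; sizes and names are assumptions.
import torch
import torch.nn as nn

class TokenRefiner(nn.Module):
    """Snap synthesized feature tokens to their nearest codebook entries."""

    def __init__(self, num_entries: int = 1024, dim: int = 256):
        super().__init__()
        # In VQ-Font this codebook would come from the pre-trained VQGAN;
        # here it is randomly initialized purely for illustration.
        self.codebook = nn.Embedding(num_entries, dim)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, N, dim) feature tokens of a synthesized glyph (N = H*W of the latent grid).
        flat = feats.reshape(-1, feats.size(-1))                # (B*N, dim)
        dists = torch.cdist(flat, self.codebook.weight)         # (B*N, num_entries)
        indices = dists.argmin(dim=-1).view(feats.shape[:-1])   # (B, N) nearest entry per token
        quantized = self.codebook(indices)                      # (B, N, dim)
        # Straight-through trick: forward the quantized tokens, keep gradients for the input.
        return feats + (quantized - feats).detach()

refiner = TokenRefiner()
tokens = torch.randn(2, 16 * 16, 256)   # e.g. a 16x16 latent grid per glyph
print(refiner(tokens).shape)            # torch.Size([2, 256, 256])
```

The straight-through step in the last line forwards the quantized tokens while letting gradients flow back to the synthesized features, which is the standard way such a codebook lookup is kept trainable.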
Related papers
- Decoupling Layout from Glyph in Online Chinese Handwriting Generation [6.566541829858544]
We develop a text line layout generator and stylized font synthesizer.
The layout generator performs in-context-like learning based on the text content and the provided style references to generate positions for each glyph autoregressively.
The font synthesizer, which consists of a character embedding dictionary, a multi-scale calligraphy style encoder, and a 1D U-Net based diffusion denoiser, generates each glyph at its position while imitating the calligraphy style extracted from the given style references.
arXiv Detail & Related papers (2024-10-03T08:46:17Z)
- DeepCalliFont: Few-shot Chinese Calligraphy Font Synthesis by Integrating Dual-modality Generative Models [20.76773399161289]
Few-shot font generation, especially for Chinese calligraphy fonts, is a challenging and ongoing problem.
We propose a novel model, DeepCalliFont, for few-shot Chinese calligraphy font synthesis by integrating dual-modality generative models.
arXiv Detail & Related papers (2023-12-16T04:23:12Z)
- Few shot font generation via transferring similarity guided global style and quantization local style [11.817299400850176]
We present a novel font generation approach by aggregating styles from character similarity-guided global features and stylized component-level representations.
Our AFFG method can obtain a complete set of component-level style representations while also controlling the global glyph characteristics (see the sketch below).
arXiv Detail & Related papers (2023-09-02T05:05:40Z)
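The similarity-guided global style above can be pictured as a soft attention step in which references whose characters resemble the target contribute more to the aggregated style. The sketch below conveys that idea only; the function name, similarity measure, and feature shapes are assumptions, not AFFG's actual design.

```python
# Schematic sketch: similarity-guided aggregation of reference style features.
import torch
import torch.nn.functional as F

def aggregate_global_style(target_content: torch.Tensor,
                           ref_contents: torch.Tensor,
                           ref_styles: torch.Tensor) -> torch.Tensor:
    """target_content: (B, D)    content feature of the character to synthesize
       ref_contents:   (B, R, D) content features of the R reference characters
       ref_styles:     (B, R, D) style features extracted from the same references
    """
    # Cosine similarity between the target character and each reference character.
    sim = F.cosine_similarity(target_content.unsqueeze(1), ref_contents, dim=-1)  # (B, R)
    weights = F.softmax(sim, dim=-1)                                               # (B, R)
    # Similarity-weighted sum of reference styles gives one global style vector.
    return (weights.unsqueeze(-1) * ref_styles).sum(dim=1)                         # (B, D)

style = aggregate_global_style(torch.randn(4, 256),
                               torch.randn(4, 8, 256),
                               torch.randn(4, 8, 256))
print(style.shape)  # torch.Size([4, 256])
```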
- Learning Generative Structure Prior for Blind Text Image Super-resolution [153.05759524358467]
We present a novel prior that focuses more on the character structure.
To restrict the generative space of StyleGAN, we store the discrete features for each character in a codebook.
The proposed structure prior exerts stronger character-specific guidance to restore faithful and precise strokes of a designated character (see the sketch below).
arXiv Detail & Related papers (2023-03-26T13:54:28Z)
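Conceptually, the codebook in this structure prior is keyed by character class: the discrete features stored for a character are retrieved and used to guide restoration. The following sketch is hypothetical (the class count, token count, and shapes are assumptions) and only illustrates the retrieval step.

```python
# Hypothetical sketch: retrieve per-character discrete structure features from a codebook.
import torch
import torch.nn as nn

NUM_CHARS, TOKENS_PER_CHAR, DIM = 6763, 16, 512   # assumed sizes

class StructurePrior(nn.Module):
    def __init__(self):
        super().__init__()
        # A fixed set of TOKENS_PER_CHAR discrete feature vectors per character class.
        self.codebook = nn.Parameter(torch.randn(NUM_CHARS, TOKENS_PER_CHAR, DIM))

    def forward(self, char_ids: torch.Tensor) -> torch.Tensor:
        # char_ids: (B,) class indices of the characters in the degraded text image.
        # The retrieved tokens would then condition the restoration generator.
        return self.codebook[char_ids]              # (B, TOKENS_PER_CHAR, DIM)

prior = StructurePrior()
print(prior(torch.tensor([10, 42])).shape)          # torch.Size([2, 16, 512])
```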
- CF-Font: Content Fusion for Few-shot Font Generation [63.79915037830131]
We propose a content fusion module (CFM) that projects the content feature into a linear space defined by the content features of basis fonts (see the sketch below).
Our method also allows optimizing the style representation vector of reference images.
We have evaluated our method on a dataset of 300 fonts with 6.5k characters each.
arXiv Detail & Related papers (2023-03-24T14:18:40Z)
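The content fusion module can be read as projecting the input content feature onto the span of a few basis-font content features. The snippet below is a loose sketch of that projection using softmax weights over feature distances; the weighting scheme and shapes are assumptions rather than CF-Font's exact formulation.

```python
# Loose sketch: project a content feature onto the span of basis-font content features.
import torch
import torch.nn.functional as F

def content_fusion(content: torch.Tensor, basis_contents: torch.Tensor) -> torch.Tensor:
    """content:        (B, D)    content feature of the input character
       basis_contents: (B, K, D) content features of the same character in K basis fonts
    """
    # Closer basis fonts get larger weights; the weights sum to 1, so the
    # fused feature lies in the simplex spanned by the basis features.
    dists = torch.cdist(content.unsqueeze(1), basis_contents).squeeze(1)  # (B, K)
    weights = F.softmax(-dists, dim=-1)                                   # (B, K)
    return torch.einsum('bk,bkd->bd', weights, basis_contents)            # (B, D)

fused = content_fusion(torch.randn(2, 256), torch.randn(2, 10, 256))
print(fused.shape)  # torch.Size([2, 256])
```

Constraining the weights to sum to one keeps the fused content feature inside the space spanned by the basis fonts rather than drifting off it.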
- Few-shot Font Generation by Learning Style Difference and Similarity [84.76381937516356]
We propose a novel font generation approach by learning the Difference between different styles and the Similarity of the same style (DS-Font).
Specifically, we propose a multi-layer style projector for style encoding and realize a distinctive style representation via our proposed Cluster-level Contrastive Style (CCS) loss (see the sketch below).
arXiv Detail & Related papers (2023-01-24T13:57:25Z)
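The CCS loss is contrastive in spirit: style features of glyphs rendered in the same font should cluster together, while features from different fonts are pushed apart. The sketch below is a generic supervised-contrastive loss written to convey that idea; it is not the exact CCS formulation.

```python
# Generic sketch of a font-level contrastive style loss (not the exact CCS loss).
import torch
import torch.nn.functional as F

def style_contrastive_loss(styles: torch.Tensor, font_ids: torch.Tensor,
                           temperature: float = 0.1) -> torch.Tensor:
    """styles:   (N, D) projected style features of N glyph images
       font_ids: (N,)   index of the font each glyph was rendered in
    """
    z = F.normalize(styles, dim=-1)
    logits = z @ z.t() / temperature                        # pairwise similarities
    logits.fill_diagonal_(-1e9)                             # exclude self-pairs
    positives = (font_ids.unsqueeze(0) == font_ids.unsqueeze(1)).float()
    positives.fill_diagonal_(0.0)
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    # For each anchor, average the log-probability of its same-font positives.
    loss = -(positives * log_prob).sum(dim=1) / positives.sum(dim=1).clamp(min=1)
    return loss.mean()

loss = style_contrastive_loss(torch.randn(8, 128),
                              torch.tensor([0, 0, 1, 1, 2, 2, 3, 3]))
print(loss.item())
```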
- Diff-Font: Diffusion Model for Robust One-Shot Font Generation [110.45944936952309]
We propose a novel one-shot font generation method based on a diffusion model, named Diff-Font.
The proposed model aims to generate the entire font library by giving only one sample as the reference.
The well-trained Diff-Font is not only robust to font gap and font variation, but also achieves promising performance on difficult character generation.
arXiv Detail & Related papers (2022-12-12T13:51:50Z)
- Few-Shot Font Generation by Learning Fine-Grained Local Styles [90.39288370855115]
Few-shot font generation (FFG) aims to generate a new font with a few examples.
We propose a new font generation approach by learning 1) the fine-grained local styles from references, and 2) the spatial correspondence between the content and reference glyphs (see the sketch below).
arXiv Detail & Related papers (2022-05-20T05:07:05Z)
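Learning a spatial correspondence between the content glyph and the reference glyphs is often realized as cross-attention, where each spatial location of the content feature queries the references for the local style it should borrow. The sketch below assumes such a cross-attention formulation; module names and shapes are illustrative only.

```python
# Minimal sketch: content locations attend to reference locations to borrow local style.
import torch
import torch.nn as nn

class LocalStyleTransfer(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, content: torch.Tensor, references: torch.Tensor) -> torch.Tensor:
        # content:    (B, Hc*Wc, D)   features of the content (target character) glyph
        # references: (B, R*Hr*Wr, D) features of the style reference glyphs, flattened
        # Each content position gathers the reference style it spatially corresponds to.
        styled, _ = self.attn(query=content, key=references, value=references)
        return styled

block = LocalStyleTransfer()
out = block(torch.randn(2, 64, 256), torch.randn(2, 3 * 64, 256))
print(out.shape)  # torch.Size([2, 64, 256])
```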
- XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation [13.569449355929574]
We propose a self-supervised cross-modality pre-training strategy and a cross-modality transformer-based encoder.
The encoder is conditioned jointly on the glyph image and the corresponding stroke labels.
It requires only one reference glyph and achieves the lowest rate of bad cases in the few-shot font generation task, 28% lower than the second best.
arXiv Detail & Related papers (2022-04-11T13:34:40Z)
- Few-shot Font Generation with Localized Style Representations and Factorization [23.781619323447003]
We propose a novel font generation method by learning localized styles, namely component-wise style representations, instead of universal styles.
Our method shows remarkably better few-shot font generation results (with only 8 reference glyph images) than other state-of-the-art methods.
arXiv Detail & Related papers (2020-09-23T10:33:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.