Diff-Font: Diffusion Model for Robust One-Shot Font Generation
- URL: http://arxiv.org/abs/2212.05895v3
- Date: Sun, 7 May 2023 15:37:56 GMT
- Title: Diff-Font: Diffusion Model for Robust One-Shot Font Generation
- Authors: Haibin He, Xinyuan Chen, Chaoyue Wang, Juhua Liu, Bo Du, Dacheng Tao,
Yu Qiao
- Abstract summary: We propose a novel one-shot font generation method based on a diffusion model, named Diff-Font.
The proposed model aims to generate the entire font library given only one sample as the reference.
The well-trained Diff-Font is not only robust to the font gap and font variation, but also achieves promising performance on difficult character generation.
- Score: 110.45944936952309
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Font generation is a difficult and time-consuming task, especially
for languages such as Chinese that use ideograms with complicated structures
and a large number of characters. To address this problem, few-shot and even
one-shot font generation have attracted a lot of attention. However, most
existing font generation methods still suffer from (i) the large cross-font
gap; (ii) subtle cross-font variations; and (iii) incorrect generation of
complicated characters. In this paper, we propose a novel one-shot font
generation method based on a diffusion model, named Diff-Font, which can be
stably trained on large datasets. The proposed model aims to generate the
entire font library given only one sample as the reference. Specifically, a
large stroke-wise dataset is constructed, and a stroke-wise diffusion model is
proposed to preserve the structure and completeness of each generated
character. To the best of our knowledge, Diff-Font is the first work to apply
diffusion models to the font generation task. The well-trained Diff-Font is
not only robust to the font gap and font variation, but also achieves
promising performance on generating difficult characters. Compared with
previous font generation methods, our model reaches state-of-the-art
performance both qualitatively and quantitatively.
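Since the paper itself ships no code, the following minimal PyTorch sketch only illustrates
the kind of conditional denoising-diffusion training step described above, with the noise
predictor conditioned on a character (content) id, a style id taken from the one reference
sample, and a stroke attribute vector. The tiny stand-in network, embedding sizes, and noise
schedule are illustrative assumptions, not the authors' implementation.

```python
# Illustrative-only sketch: conditional DDPM training step for glyph images,
# conditioned on content id, style id, and a stroke attribute vector.
import torch
import torch.nn as nn
import torch.nn.functional as F

T = 1000                                    # diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)       # linear noise schedule
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

class TinyCondDenoiser(nn.Module):
    """Stand-in for the UNet noise predictor used in diffusion models."""
    def __init__(self, num_chars=6625, num_styles=400, stroke_dim=32, ch=64):
        super().__init__()
        self.content_emb = nn.Embedding(num_chars, ch)
        self.style_emb = nn.Embedding(num_styles, ch)
        self.stroke_proj = nn.Linear(stroke_dim, ch)
        self.time_proj = nn.Linear(1, ch)
        self.net = nn.Sequential(
            nn.Conv2d(1 + ch, ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(ch, 1, 3, padding=1),
        )

    def forward(self, x_t, t, content_id, style_id, strokes):
        cond = (self.content_emb(content_id) + self.style_emb(style_id)
                + self.stroke_proj(strokes)
                + self.time_proj(t.float().unsqueeze(-1) / T))
        cond_map = cond[:, :, None, None].expand(-1, -1, *x_t.shape[-2:])
        return self.net(torch.cat([x_t, cond_map], dim=1))

def training_step(model, x0, content_id, style_id, strokes):
    """One DDPM-style step: add noise at a random timestep, predict it."""
    b = x0.size(0)
    t = torch.randint(0, T, (b,))
    noise = torch.randn_like(x0)
    a_bar = alphas_cumprod[t].view(b, 1, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise
    return F.mse_loss(model(x_t, t, content_id, style_id, strokes), noise)

if __name__ == "__main__":
    model = TinyCondDenoiser()
    x0 = torch.randn(4, 1, 64, 64)            # batch of 64x64 glyph images
    loss = training_step(model, x0,
                         content_id=torch.randint(0, 6625, (4,)),
                         style_id=torch.randint(0, 400, (4,)),
                         strokes=torch.rand(4, 32))
    print(loss.item())
```

In a real system the noise predictor would be a UNet with the conditions injected through
attention or feature modulation; the stand-in above only shows how the conditioning signals
and the MSE noise-prediction loss fit together.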
Related papers
- DiffCJK: Conditional Diffusion Model for High-Quality and Wide-coverage CJK Character Generation [1.0044057719679087]
We propose a novel diffusion method for generating glyphs in a targeted style from a single conditioned, standard glyph form.
Our approach shows remarkable zero-shot generalization capabilities for non-CJK but Chinese-inspired scripts.
In summary, our proposed method opens the door to high-quality, generative model-assisted font creation for CJK characters.
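As a complement to the training sketch above, here is a hedged, illustrative-only sketch of
image-conditioned DDPM sampling, where the denoiser sees the noisy glyph concatenated with a
fixed standard source glyph. This is one common conditioning scheme for image-to-image
diffusion, not necessarily DiffCJK's actual architecture.

```python
# Illustrative-only sketch: ancestral DDPM sampling conditioned on a source glyph.
import torch
import torch.nn as nn

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alphas_cumprod = torch.cumprod(alphas, dim=0)

# Stand-in denoiser: takes [noisy glyph, source glyph] as a 2-channel input.
denoiser = nn.Sequential(
    nn.Conv2d(2, 64, 3, padding=1), nn.SiLU(),
    nn.Conv2d(64, 1, 3, padding=1),
)

@torch.no_grad()
def sample(source_glyph, shape=(1, 1, 64, 64)):
    """Reverse diffusion, conditioned on the fixed standard source glyph."""
    x = torch.randn(shape)
    for t in reversed(range(T)):
        eps = denoiser(torch.cat([x, source_glyph], dim=1))
        a, a_bar = alphas[t], alphas_cumprod[t]
        mean = (x - (1 - a) / (1 - a_bar).sqrt() * eps) / a.sqrt()
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + betas[t].sqrt() * noise
    return x

styled = sample(source_glyph=torch.randn(1, 1, 64, 64))
print(styled.shape)  # torch.Size([1, 1, 64, 64])
```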
arXiv Detail & Related papers (2024-04-08T05:58:07Z)
- FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning [45.696909070215476]
FontDiffuser is a diffusion-based image-to-image one-shot font generation method.
It consistently excels on complex characters and large style changes compared to previous methods.
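As a rough illustration of what "style contrastive learning" can look like, the sketch below
computes a generic InfoNCE loss over style embeddings; the sampling of positives and
negatives and the temperature are assumptions, not FontDiffuser's exact loss.

```python
# Generic InfoNCE-style contrastive loss over style embeddings (illustrative only).
import torch
import torch.nn.functional as F

def style_contrastive_loss(anchor, positive, negatives, tau=0.07):
    """anchor/positive: (B, D); negatives: (B, K, D)."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos_logit = (anchor * positive).sum(-1, keepdim=True) / tau       # (B, 1)
    neg_logits = torch.einsum("bd,bkd->bk", anchor, negatives) / tau  # (B, K)
    logits = torch.cat([pos_logit, neg_logits], dim=1)
    target = torch.zeros(anchor.size(0), dtype=torch.long)            # positive is class 0
    return F.cross_entropy(logits, target)

loss = style_contrastive_loss(torch.randn(8, 128), torch.randn(8, 128),
                              torch.randn(8, 16, 128))
print(loss.item())
```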
arXiv Detail & Related papers (2023-12-19T13:23:20Z)
- DeepCalliFont: Few-shot Chinese Calligraphy Font Synthesis by Integrating Dual-modality Generative Models [20.76773399161289]
Few-shot font generation, especially for Chinese calligraphy fonts, is a challenging and ongoing problem.
We propose a novel model, DeepCalliFont, for few-shot Chinese calligraphy font synthesis by integrating dual-modality generative models.
arXiv Detail & Related papers (2023-12-16T04:23:12Z)
- VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and Quantization [52.870638830417]
We propose a VQGAN-based framework (i.e., VQ-Font) to enhance glyph fidelity through token prior refinement and structure-aware enhancement.
Specifically, we pre-train a VQGAN to encapsulate the font token prior within a codebook. Subsequently, VQ-Font refines the synthesized glyphs with the codebook to eliminate the domain gap between synthesized and real-world strokes.
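For reference, the vector-quantization step at the heart of such a codebook can be sketched
as below; the codebook size, feature dimension, and straight-through estimator are generic
assumptions rather than VQ-Font's code.

```python
# Minimal VQ step: snap each glyph feature vector to its nearest codebook entry.
import torch
import torch.nn as nn

class Codebook(nn.Module):
    def __init__(self, num_codes=1024, dim=256):
        super().__init__()
        self.embed = nn.Embedding(num_codes, dim)

    def forward(self, z):                          # z: (B, N, dim) glyph features
        flat = z.reshape(-1, z.size(-1))           # (B*N, dim)
        d = torch.cdist(flat, self.embed.weight)   # distances to all code vectors
        idx = d.argmin(dim=-1).view(z.shape[:-1])  # nearest code index per feature
        z_q = self.embed(idx)                      # quantized features
        # Straight-through estimator so gradients still reach the encoder.
        z_q = z + (z_q - z).detach()
        return z_q, idx

codebook = Codebook()
z_q, idx = codebook(torch.randn(2, 16, 256))
print(z_q.shape, idx.shape)   # (2, 16, 256) (2, 16)
```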
arXiv Detail & Related papers (2023-08-27T06:32:20Z)
- GlyphDiffusion: Text Generation as Image Generation [100.98428068214736]
We propose GlyphDiffusion, a novel diffusion approach for text generation via text-guided image generation.
Our key idea is to render the target text as a glyph image containing visual language content.
Our model also achieves significant improvements over recent diffusion models.
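The core idea of rendering target text as a glyph image can be sketched in a few lines; the
renderer, font, and canvas size below are illustrative assumptions, not the paper's setup.

```python
# Render target text as a grayscale glyph image (illustrative only; uses PIL's
# built-in default bitmap font rather than the paper's renderer).
from PIL import Image, ImageDraw, ImageFont

def render_glyph_image(text, size=(256, 64)):
    img = Image.new("L", size, color=255)        # white canvas, grayscale
    draw = ImageDraw.Draw(img)
    font = ImageFont.load_default()
    draw.text((4, 4), text, fill=0, font=font)   # black text
    return img

render_glyph_image("font generation").save("glyph.png")
```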
arXiv Detail & Related papers (2023-04-25T02:14:44Z)
- CF-Font: Content Fusion for Few-shot Font Generation [63.79915037830131]
We propose a content fusion module (CFM) to project the content feature into a linear space defined by the content features of basis fonts.
Our method also allows optimizing the style representation vector of reference images.
We have evaluated our method on a dataset of 300 fonts with 6.5k characters each.
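A hedged sketch of such content fusion is given below: the input content feature is
re-expressed as a convex combination of basis-font content features, with softmax weights
over negative distances. The weighting scheme is an assumption, not CF-Font's exact
formulation.

```python
# Illustrative content fusion: weighted combination of basis-font content features.
import torch
import torch.nn.functional as F

def content_fusion(content_feat, basis_feats, temperature=0.1):
    """content_feat: (D,); basis_feats: (M, D) content features of M basis fonts."""
    dists = torch.cdist(content_feat[None, :], basis_feats)[0]   # (M,)
    weights = F.softmax(-dists / temperature, dim=0)             # closer basis -> larger weight
    return weights @ basis_feats                                 # fused content feature, (D,)

fused = content_fusion(torch.randn(256), torch.randn(10, 256))
print(fused.shape)  # torch.Size([256])
```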
arXiv Detail & Related papers (2023-03-24T14:18:40Z)
- XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation [13.569449355929574]
We propose a self-supervised cross-modality pre-training strategy and a cross-modality transformer-based encoder.
The encoder is conditioned jointly on the glyph image and the corresponding stroke labels.
It requires only one reference glyph and achieves the lowest rate of bad cases in the few-shot font generation task, 28% lower than the second best.
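A rough sketch of a cross-modality encoder that jointly attends over glyph-image patches and
stroke-label tokens is shown below; the patch size, widths, and pooling are illustrative
assumptions rather than XMP-Font's actual architecture.

```python
# Illustrative cross-modality encoder over image patches + stroke-label tokens.
import torch
import torch.nn as nn

class CrossModalEncoder(nn.Module):
    def __init__(self, num_strokes=33, dim=128, patch=8):
        super().__init__()
        self.patch_embed = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)
        self.stroke_embed = nn.Embedding(num_strokes, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, glyph, stroke_ids):
        # glyph: (B, 1, H, W); stroke_ids: (B, L) integer stroke labels
        patches = self.patch_embed(glyph).flatten(2).transpose(1, 2)  # (B, P, dim)
        strokes = self.stroke_embed(stroke_ids)                       # (B, L, dim)
        tokens = torch.cat([patches, strokes], dim=1)                 # joint sequence
        return self.encoder(tokens).mean(dim=1)                       # pooled representation

enc = CrossModalEncoder()
feat = enc(torch.randn(2, 1, 64, 64), torch.randint(0, 33, (2, 20)))
print(feat.shape)  # torch.Size([2, 128])
```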
arXiv Detail & Related papers (2022-04-11T13:34:40Z)
- Scalable Font Reconstruction with Dual Latent Manifolds [55.29525824849242]
We propose a deep generative model that performs typography analysis and font reconstruction.
Our approach enables us to massively scale up the number of character types we can effectively model.
We evaluate on the task of font reconstruction over various datasets representing character types of many languages.
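As a toy illustration of the dual-latent idea (one latent per font, one per character type,
combined by a shared decoder into a glyph), consider the sketch below; the latent sizes and
decoder are assumptions, not the paper's model.

```python
# Toy dual-latent decoder: font latent + character latent -> glyph image.
import torch
import torch.nn as nn

class DualLatentDecoder(nn.Module):
    def __init__(self, font_dim=64, char_dim=64, out_hw=32):
        super().__init__()
        self.out_hw = out_hw
        self.decode = nn.Sequential(
            nn.Linear(font_dim + char_dim, 512), nn.ReLU(),
            nn.Linear(512, out_hw * out_hw), nn.Sigmoid(),
        )

    def forward(self, z_font, z_char):
        glyph = self.decode(torch.cat([z_font, z_char], dim=-1))
        return glyph.view(-1, 1, self.out_hw, self.out_hw)

dec = DualLatentDecoder()
print(dec(torch.randn(3, 64), torch.randn(3, 64)).shape)  # (3, 1, 32, 32)
```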
arXiv Detail & Related papers (2021-09-10T20:37:43Z)
- DG-Font: Deformable Generative Networks for Unsupervised Font Generation [14.178381391124036]
We propose novel deformable generative networks for unsupervised font generation (DG-Font).
We introduce a feature deformation skip connection (FDSC) which predicts pairs of displacement maps and employs the predicted maps to apply deformable convolution to the low-level feature maps from the content encoder.
Experiments demonstrate that our model generates characters of higher quality than state-of-the-art methods.
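The sketch below shows what a feature deformation skip connection can look like in PyTorch,
using torchvision's DeformConv2d with offsets predicted by a small convolution; the channel
sizes and offset predictor are assumptions, not DG-Font's code.

```python
# Illustrative FDSC-style block: predict offsets, then deform the content features.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class FDSC(nn.Module):
    def __init__(self, channels=64, kernel_size=3):
        super().__init__()
        # 2 offsets (dx, dy) per kernel position per output location.
        self.offset_pred = nn.Conv2d(channels, 2 * kernel_size * kernel_size,
                                     kernel_size, padding=1)
        self.deform = DeformConv2d(channels, channels, kernel_size, padding=1)

    def forward(self, content_feat):
        offsets = self.offset_pred(content_feat)
        return self.deform(content_feat, offsets)

fdsc = FDSC()
out = fdsc(torch.randn(1, 64, 32, 32))
print(out.shape)  # torch.Size([1, 64, 32, 32])
```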
arXiv Detail & Related papers (2021-04-07T11:32:32Z)
- Few-shot Compositional Font Generation with Dual Memory [16.967987801167514]
We propose a novel font generation framework, named Dual Memory-augmented Font Generation Network (DM-Font).
We employ memory components and global-context awareness in the generator to take advantage of the compositionality.
In experiments on Korean handwriting fonts and Thai printing fonts, we observe that our method generates samples of significantly better quality with faithful stylization.
arXiv Detail & Related papers (2020-05-21T08:13:40Z)
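A toy component-wise memory, shown below, illustrates how compositionality can be exploited
in the DM-Font entry above: style features for sub-glyph components extracted from the
reference glyphs are written to a memory and read back to assemble unseen characters. The
memory layout and keys are assumptions, not DM-Font's dual-memory design.

```python
# Toy component-wise style memory (illustrative only).
import torch

class ComponentMemory:
    def __init__(self):
        self.store = {}                       # component id -> style feature

    def write(self, component_id: str, feat: torch.Tensor):
        self.store[component_id] = feat

    def read(self, component_ids):
        # Stack stored features for the components of the target character.
        return torch.stack([self.store[c] for c in component_ids])

memory = ComponentMemory()
# Write component-wise style features parsed from the reference glyphs.
for comp in ["initial:g", "medial:a", "final:n"]:
    memory.write(comp, torch.randn(128))
# Read them back to condition the generator on an unseen composition.
print(memory.read(["initial:g", "medial:a"]).shape)  # torch.Size([2, 128])
```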