DeepCalliFont: Few-shot Chinese Calligraphy Font Synthesis by
Integrating Dual-modality Generative Models
- URL: http://arxiv.org/abs/2312.10314v1
- Date: Sat, 16 Dec 2023 04:23:12 GMT
- Title: DeepCalliFont: Few-shot Chinese Calligraphy Font Synthesis by
Integrating Dual-modality Generative Models
- Authors: Yitian Liu, Zhouhui Lian
- Abstract summary: Few-shot font generation, especially for Chinese calligraphy fonts, is a challenging and ongoing problem.
We propose a novel model, DeepCalliFont, for few-shot Chinese calligraphy font synthesis by integrating dual-modality generative models.
- Score: 20.76773399161289
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot font generation, especially for Chinese calligraphy fonts, is a
challenging and ongoing problem. With the help of prior knowledge that is
mainly based on glyph consistency assumptions, some recently proposed methods
can synthesize high-quality Chinese glyph images. However, glyphs in
calligraphy font styles often do not meet these assumptions. To address this
problem, we propose a novel model, DeepCalliFont, for few-shot Chinese
calligraphy font synthesis by integrating dual-modality generative models.
Specifically, the proposed model consists of image synthesis and sequence
generation branches, generating consistent results via a dual-modality
representation learning strategy. The two modalities (i.e., glyph images and
writing sequences) are properly integrated using a feature recombination module
and a rasterization loss function. Furthermore, a new pre-training strategy is
adopted to improve the performance by exploiting large amounts of uni-modality
data. Both qualitative and quantitative experiments have been conducted to
demonstrate the superiority of our method over other state-of-the-art approaches
in the task of few-shot Chinese calligraphy font synthesis. The source code can
be found at https://github.com/lsflyt-pku/DeepCalliFont.
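To make the dual-branch design concrete, here is a minimal, hypothetical PyTorch sketch of an image-synthesis branch and a sequence-generation branch coupled by a feature recombination step and a soft rasterization loss. All module names, shapes, and the Gaussian-bump rasterizer are assumptions for illustration; the actual architecture is in the official repository linked above.

```python
# Hypothetical sketch only: names and shapes are assumptions based on the
# abstract; the real implementation is at
# https://github.com/lsflyt-pku/DeepCalliFont
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualModalityModel(nn.Module):
    def __init__(self, img_dim=256, seq_dim=256, seq_feats=5):
        super().__init__()
        # Image-synthesis branch: convolutional encoder/decoder for glyph images.
        self.img_encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, img_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.img_decoder = nn.Sequential(
            nn.ConvTranspose2d(img_dim, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )
        # Sequence-generation branch: models writing trajectories
        # (e.g., dx, dy and pen states per step).
        self.seq_rnn = nn.GRU(seq_feats, seq_dim, batch_first=True)
        self.seq_head = nn.Linear(seq_dim, seq_feats)
        # Feature recombination: mix global features of both modalities.
        self.recombine = nn.Linear(img_dim + seq_dim, img_dim)

    def forward(self, img, seq):
        f_img = self.img_encoder(img)                 # (B, C, H/4, W/4)
        g_img = f_img.mean(dim=(2, 3))                # global image feature
        h_seq, _ = self.seq_rnn(seq)                  # (B, T, seq_dim)
        g_seq = h_seq.mean(dim=1)                     # global sequence feature
        mixed = self.recombine(torch.cat([g_img, g_seq], dim=-1))
        f_img = f_img + mixed[:, :, None, None]       # condition image branch
        return self.img_decoder(f_img), self.seq_head(h_seq)

def rasterization_loss(pred_img, points, sigma=2.0):
    """Toy differentiable rasterizer: draw a Gaussian bump per trajectory
    point and compare the soft canvas with the synthesized image."""
    B, _, H, W = pred_img.shape
    ys = torch.arange(H, dtype=torch.float32).view(1, 1, H, 1)
    xs = torch.arange(W, dtype=torch.float32).view(1, 1, 1, W)
    px = points[..., 0].view(B, -1, 1, 1)
    py = points[..., 1].view(B, -1, 1, 1)
    bumps = torch.exp(-((xs - px) ** 2 + (ys - py) ** 2) / (2 * sigma ** 2))
    raster = bumps.max(dim=1).values.unsqueeze(1)     # (B, 1, H, W)
    return F.mse_loss(raster, pred_img)

model = DualModalityModel()
img_out, seq_out = model(torch.rand(2, 1, 64, 64), torch.rand(2, 50, 5))
loss = rasterization_loss(img_out, torch.rand(2, 50, 2) * 64)
```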
Related papers
- HFH-Font: Few-shot Chinese Font Synthesis with Higher Quality, Faster Speed, and Higher Resolution [17.977410216055024]
We introduce HFH-Font, a few-shot font synthesis method capable of efficiently generating high-resolution glyph images.
For the first time, large-scale Chinese vector fonts of a quality comparable to those manually created by professional font designers can be automatically generated.
arXiv Detail & Related papers (2024-10-09T02:30:24Z) - VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and
Quantization [52.870638830417]
We propose a VQGAN-based framework (i.e., VQ-Font) to enhance glyph fidelity through token prior refinement and structure-aware enhancement.
Specifically, we pre-train a VQGAN to encapsulate font token prior within a codebook. Subsequently, VQ-Font refines the synthesized glyphs with the codebook to eliminate the domain gap between synthesized and real-world strokes.
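As a hedged illustration of the codebook step described above (not VQ-Font's actual code), a VQGAN-style quantizer snaps continuous glyph features onto their nearest learned codebook entries:

```python
# Generic VQGAN-style codebook quantization; an illustration only.
import torch

def quantize(features, codebook):
    # features: (N, D) continuous glyph features; codebook: (K, D) entries
    # learned during pre-training to hold the font token prior.
    dists = torch.cdist(features, codebook)   # (N, K) pairwise distances
    idx = dists.argmin(dim=1)                 # nearest token per feature
    quantized = codebook[idx]                 # snap onto the prior
    # Straight-through estimator: gradients bypass the non-differentiable argmin.
    return features + (quantized - features).detach(), idx

tokens, ids = quantize(torch.randn(8, 64), torch.randn(512, 64))
```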
arXiv Detail & Related papers (2023-08-27T06:32:20Z) - GlyphDiffusion: Text Generation as Image Generation [100.98428068214736]
We propose GlyphDiffusion, a novel diffusion approach for text generation via text-guided image generation.
Our key idea is to render the target text as a glyph image containing visual language content.
Our model also makes significant improvements compared to a recent diffusion model.
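The core idea of rendering target text as a glyph image can be shown in a few lines of Pillow; the font path below is an assumption and may differ per system:

```python
# Minimal text-to-glyph-image rendering with Pillow; font path is an assumption.
from PIL import Image, ImageDraw, ImageFont

def render_glyph_image(text, size=(256, 64), font_path="DejaVuSans.ttf"):
    canvas = Image.new("L", size, color=255)       # white grayscale canvas
    draw = ImageDraw.Draw(canvas)
    font = ImageFont.truetype(font_path, 32)
    draw.text((4, 8), text, fill=0, font=font)     # render text as black glyphs
    return canvas

glyphs = render_glyph_image("hello world")
```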
arXiv Detail & Related papers (2023-04-25T02:14:44Z) - Diff-Font: Diffusion Model for Robust One-Shot Font Generation [110.45944936952309]
We propose a novel one-shot font generation method based on a diffusion model, named Diff-Font.
The proposed model aims to generate the entire font library given only one sample as the reference.
The well-trained Diff-Font is not only robust to font gap and font variation, but also achieves promising performance on difficult character generation.
arXiv Detail & Related papers (2022-12-12T13:51:50Z) - XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font
Generation [13.569449355929574]
We propose a self-supervised cross-modality pre-training strategy and a cross-modality transformer-based encoder.
The encoder is conditioned jointly on the glyph image and the corresponding stroke labels.
It requires only one reference glyph and achieves the lowest bad-case rate in the few-shot font generation task, 28% lower than the second best.
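A minimal sketch of a cross-modality transformer encoder jointly conditioned on glyph-image patches and stroke labels, in the spirit of this entry; all dimensions and names are assumptions:

```python
# Hedged sketch of a cross-modality encoder; shapes and names are assumptions.
import torch
import torch.nn as nn

class CrossModalityEncoder(nn.Module):
    def __init__(self, d_model=128, n_stroke_labels=32, patch_dim=64):
        super().__init__()
        self.patch_proj = nn.Linear(patch_dim, d_model)   # flattened image patches
        self.stroke_emb = nn.Embedding(n_stroke_labels, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, patches, stroke_ids):
        # patches: (B, P, patch_dim); stroke_ids: (B, S) integer stroke labels.
        tokens = torch.cat(
            [self.patch_proj(patches), self.stroke_emb(stroke_ids)], dim=1)
        return self.encoder(tokens)   # joint image + stroke representation

enc = CrossModalityEncoder()
out = enc(torch.rand(2, 16, 64), torch.randint(0, 32, (2, 10)))
```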
arXiv Detail & Related papers (2022-04-11T13:34:40Z) - DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality
Learning [21.123297001902177]
We propose a novel method, DeepVecFont, to generate visually-pleasing vector glyphs.
The highlights of this paper are threefold. First, we design a dual-modality learning strategy which utilizes both image-aspect and sequence-aspect features of fonts to synthesize vector glyphs.
Second, we provide a new generative paradigm to handle unstructured data (e.g., vector glyphs) by randomly sampling plausible results to get the optimal one, which is then further refined under the guidance of generated structured data.
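The sample-then-select step of this paradigm can be sketched in toy form; `generate` and `score` below are hypothetical callables standing in for the paper's components:

```python
# Toy sample-then-select: draw several candidates, keep the one a scorer prefers.
import torch

def best_of_n(generate, score, n=10):
    candidates = [generate() for _ in range(n)]       # plausible random samples
    scores = torch.stack([score(c) for c in candidates])
    return candidates[int(scores.argmax())]           # candidate to refine further

best = best_of_n(lambda: torch.randn(8), lambda c: -c.abs().sum())
```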
arXiv Detail & Related papers (2021-10-13T12:57:19Z) - Scalable Font Reconstruction with Dual Latent Manifolds [55.29525824849242]
We propose a deep generative model that performs typography analysis and font reconstruction.
Our approach enables us to massively scale up the number of character types we can effectively model.
We evaluate on the task of font reconstruction over various datasets representing character types of many languages.
arXiv Detail & Related papers (2021-09-10T20:37:43Z) - Font Completion and Manipulation by Cycling Between Multi-Modality
Representations [113.26243126754704]
We explore the generation of font glyphs as 2D graphic objects, using a graph as an intermediate representation.
We formulate a cross-modality cycled image-to-image structure with a graph between an image encoder and an image renderer.
Our model generates better results than both the image-to-image baseline and previous state-of-the-art methods for glyph completion.
arXiv Detail & Related papers (2021-08-30T02:43:29Z) - A Multi-Implicit Neural Representation for Fonts [79.6123184198301]
Font-specific discontinuities like edges and corners are difficult to represent using neural networks.
We introduce multi-implicits to represent fonts as a permutation-invariant set of learned implicit functions, without losing these features.
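A hedged sketch of the multi-implicit idea: each glyph is a set of learned implicit functions whose pointwise min (a shape union) is independent of ordering, hence permutation-invariant. The architecture below is illustrative only:

```python
# Illustrative only: glyph as a permutation-invariant set of implicit functions.
import torch
import torch.nn as nn

class MultiImplicit(nn.Module):
    def __init__(self, n_implicits=4, hidden=64):
        super().__init__()
        # One small MLP per implicit; each maps a 2D point to a signed distance.
        self.fns = nn.ModuleList(
            nn.Sequential(nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, 1))
            for _ in range(n_implicits)
        )

    def forward(self, xy):
        # xy: (N, 2) query points; min over implicits is order-independent.
        values = torch.cat([f(xy) for f in self.fns], dim=1)  # (N, n_implicits)
        return values.min(dim=1).values                       # (N,) distances

sdf = MultiImplicit()(torch.rand(100, 2))
```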
arXiv Detail & Related papers (2021-06-12T21:40:11Z) - Few-Shot Font Generation with Deep Metric Learning [33.12829580813688]
The proposed framework introduces deep metric learning to style encoders.
We performed experiments using black-and-white and shape-distinctive font datasets.
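The deep-metric-learning idea for style encoders can be sketched with a standard triplet margin loss; the `encoder` and glyph batches below are hypothetical placeholders:

```python
# Triplet margin loss on a style encoder; placeholders stand in for real data.
import torch
import torch.nn as nn

triplet = nn.TripletMarginLoss(margin=1.0)

def style_metric_loss(encoder, anchor, same_font, other_font):
    # Pull same-font glyph styles together; push different-font styles apart.
    return triplet(encoder(anchor), encoder(same_font), encoder(other_font))

enc = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 32))
batch = lambda: torch.rand(4, 1, 64, 64)
loss = style_metric_loss(enc, batch(), batch(), batch())
```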
arXiv Detail & Related papers (2020-11-04T10:12:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.