DeepCalliFont: Few-shot Chinese Calligraphy Font Synthesis by
Integrating Dual-modality Generative Models
- URL: http://arxiv.org/abs/2312.10314v1
- Date: Sat, 16 Dec 2023 04:23:12 GMT
- Title: DeepCalliFont: Few-shot Chinese Calligraphy Font Synthesis by
Integrating Dual-modality Generative Models
- Authors: Yitian Liu, Zhouhui Lian
- Abstract summary: Few-shot font generation, especially for Chinese calligraphy fonts, is a challenging and ongoing problem.
We propose a novel model, DeepCalliFont, for few-shot Chinese calligraphy font synthesis by integrating dual-modality generative models.
- Score: 20.76773399161289
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot font generation, especially for Chinese calligraphy fonts, is a
challenging and ongoing problem. With the help of prior knowledge that is
mainly based on glyph consistency assumptions, some recently proposed methods
can synthesize high-quality Chinese glyph images. However, glyphs in
calligraphy font styles often do not meet these assumptions. To address this
problem, we propose a novel model, DeepCalliFont, for few-shot Chinese
calligraphy font synthesis by integrating dual-modality generative models.
Specifically, the proposed model consists of image synthesis and sequence
generation branches, generating consistent results via a dual-modality
representation learning strategy. The two modalities (i.e., glyph images and
writing sequences) are properly integrated using a feature recombination module
and a rasterization loss function. Furthermore, a new pre-training strategy is
adopted to improve the performance by exploiting large amounts of uni-modality
data. Both qualitative and quantitative experiments have been conducted to
demonstrate the superiority of our method over other state-of-the-art approaches
in the task of few-shot Chinese calligraphy font synthesis. The source code can
be found at https://github.com/lsflyt-pku/DeepCalliFont.
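To make the dual-branch design concrete, here is a minimal, hypothetical PyTorch sketch of an image-synthesis branch and a sequence-generation branch coupled by a feature recombination step and a soft rasterization loss. All module names, shapes, and the Gaussian-bump rasterizer are assumptions for illustration; the actual architecture is in the official repository linked above.

```python
# Hypothetical sketch only: names and shapes are assumptions based on the
# abstract; the real implementation is at
# https://github.com/lsflyt-pku/DeepCalliFont
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualModalityModel(nn.Module):
    def __init__(self, img_dim=256, seq_dim=256, seq_feats=5):
        super().__init__()
        # Image-synthesis branch: convolutional encoder/decoder for glyph images.
        self.img_encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, img_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.img_decoder = nn.Sequential(
            nn.ConvTranspose2d(img_dim, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )
        # Sequence-generation branch: models writing trajectories
        # (e.g., dx, dy and pen states per step).
        self.seq_rnn = nn.GRU(seq_feats, seq_dim, batch_first=True)
        self.seq_head = nn.Linear(seq_dim, seq_feats)
        # Feature recombination: mix global features of both modalities.
        self.recombine = nn.Linear(img_dim + seq_dim, img_dim)

    def forward(self, img, seq):
        f_img = self.img_encoder(img)                 # (B, C, H/4, W/4)
        g_img = f_img.mean(dim=(2, 3))                # global image feature
        h_seq, _ = self.seq_rnn(seq)                  # (B, T, seq_dim)
        g_seq = h_seq.mean(dim=1)                     # global sequence feature
        mixed = self.recombine(torch.cat([g_img, g_seq], dim=-1))
        f_img = f_img + mixed[:, :, None, None]       # condition image branch
        return self.img_decoder(f_img), self.seq_head(h_seq)

def rasterization_loss(pred_img, points, sigma=2.0):
    """Toy differentiable rasterizer: draw a Gaussian bump per trajectory
    point and compare the soft canvas with the synthesized image."""
    B, _, H, W = pred_img.shape
    ys = torch.arange(H, dtype=torch.float32).view(1, 1, H, 1)
    xs = torch.arange(W, dtype=torch.float32).view(1, 1, 1, W)
    px = points[..., 0].view(B, -1, 1, 1)
    py = points[..., 1].view(B, -1, 1, 1)
    bumps = torch.exp(-((xs - px) ** 2 + (ys - py) ** 2) / (2 * sigma ** 2))
    raster = bumps.max(dim=1).values.unsqueeze(1)     # (B, 1, H, W)
    return F.mse_loss(raster, pred_img)

model = DualModalityModel()
img_out, seq_out = model(torch.rand(2, 1, 64, 64), torch.rand(2, 50, 5))
loss = rasterization_loss(img_out, torch.rand(2, 50, 2) * 64)
```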
Related papers
- HFH-Font: Few-shot Chinese Font Synthesis with Higher Quality, Faster Speed, and Higher Resolution [17.977410216055024]
We introduce HFH-Font, a few-shot font synthesis method capable of efficiently generating high-resolution glyph images.
For the first time, large-scale Chinese vector fonts of a quality comparable to those manually created by professional font designers can be automatically generated.
arXiv Detail & Related papers (2024-10-09T02:30:24Z) - VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and
Quantization [52.870638830417]
We propose a VQGAN-based framework (i.e., VQ-Font) to enhance glyph fidelity through token prior refinement and structure-aware enhancement.
Specifically, we pre-train a VQGAN to encapsulate font token prior within a codebook. Subsequently, VQ-Font refines the synthesized glyphs with the codebook to eliminate the domain gap between synthesized and real-world strokes.
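As a hedged illustration of the codebook step described above (not VQ-Font's actual code), a VQGAN-style quantizer snaps continuous glyph features onto their nearest learned codebook entries:

```python
# Generic VQGAN-style codebook quantization; an illustration only.
import torch

def quantize(features, codebook):
    # features: (N, D) continuous glyph features; codebook: (K, D) entries
    # learned during pre-training to hold the font token prior.
    dists = torch.cdist(features, codebook)   # (N, K) pairwise distances
    idx = dists.argmin(dim=1)                 # nearest token per feature
    quantized = codebook[idx]                 # snap onto the prior
    # Straight-through estimator: gradients bypass the non-differentiable argmin.
    return features + (quantized - features).detach(), idx

tokens, ids = quantize(torch.randn(8, 64), torch.randn(512, 64))
```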
arXiv Detail & Related papers (2023-08-27T06:32:20Z) - GlyphDiffusion: Text Generation as Image Generation [100.98428068214736]
We propose GlyphDiffusion, a novel diffusion approach for text generation via text-guided image generation.
Our key idea is to render the target text as a glyph image containing visual language content.
Our model also makes significant improvements compared to a recent diffusion model.
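The core idea of rendering target text as a glyph image can be shown in a few lines of Pillow; the font path below is an assumption and may differ per system:

```python
# Minimal text-to-glyph-image rendering with Pillow; font path is an assumption.
from PIL import Image, ImageDraw, ImageFont

def render_glyph_image(text, size=(256, 64), font_path="DejaVuSans.ttf"):
    canvas = Image.new("L", size, color=255)       # white grayscale canvas
    draw = ImageDraw.Draw(canvas)
    font = ImageFont.truetype(font_path, 32)
    draw.text((4, 8), text, fill=0, font=font)     # render text as black glyphs
    return canvas

glyphs = render_glyph_image("hello world")
```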
arXiv Detail & Related papers (2023-04-25T02:14:44Z) - Diff-Font: Diffusion Model for Robust One-Shot Font Generation [110.45944936952309]
We propose a novel one-shot font generation method based on a diffusion model, named Diff-Font.
The proposed model aims to generate the entire font library given only one sample as the reference.
The well-trained Diff-Font is not only robust to font gap and font variation, but also achieves promising performance on difficult character generation.
arXiv Detail & Related papers (2022-12-12T13:51:50Z) - XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font
Generation [13.569449355929574]
We propose a self-supervised cross-modality pre-training strategy and a cross-modality transformer-based encoder.
The encoder is conditioned jointly on the glyph image and the corresponding stroke labels.
It requires only one reference glyph and achieves the lowest bad-case rate in the few-shot font generation task, 28% lower than the second best.
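A minimal sketch of a cross-modality transformer encoder jointly conditioned on glyph-image patches and stroke labels, in the spirit of this entry; all dimensions and names are assumptions:

```python
# Hedged sketch of a cross-modality encoder; shapes and names are assumptions.
import torch
import torch.nn as nn

class CrossModalityEncoder(nn.Module):
    def __init__(self, d_model=128, n_stroke_labels=32, patch_dim=64):
        super().__init__()
        self.patch_proj = nn.Linear(patch_dim, d_model)   # flattened image patches
        self.stroke_emb = nn.Embedding(n_stroke_labels, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, patches, stroke_ids):
        # patches: (B, P, patch_dim); stroke_ids: (B, S) integer stroke labels.
        tokens = torch.cat(
            [self.patch_proj(patches), self.stroke_emb(stroke_ids)], dim=1)
        return self.encoder(tokens)   # joint image + stroke representation

enc = CrossModalityEncoder()
out = enc(torch.rand(2, 16, 64), torch.randint(0, 32, (2, 10)))
```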
arXiv Detail & Related papers (2022-04-11T13:34:40Z) - DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality
Learning [21.123297001902177]
We propose a novel method, DeepVecFont, to generate visually-pleasing vector glyphs.
The highlights of this paper are threefold. First, we design a dual-modality learning strategy which utilizes both image-aspect and sequence-aspect features of fonts to synthesize vector glyphs.
Second, we provide a new generative paradigm to handle unstructured data (e.g., vector glyphs) by randomly sampling plausible results to get the optimal one, which is then further refined under the guidance of generated structured data.
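The sample-then-select step of this paradigm can be sketched in toy form; `generate` and `score` below are hypothetical callables standing in for the paper's components:

```python
# Toy sample-then-select: draw several candidates, keep the one a scorer prefers.
import torch

def best_of_n(generate, score, n=10):
    candidates = [generate() for _ in range(n)]       # plausible random samples
    scores = torch.stack([score(c) for c in candidates])
    return candidates[int(scores.argmax())]           # candidate to refine further

best = best_of_n(lambda: torch.randn(8), lambda c: -c.abs().sum())
```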
arXiv Detail & Related papers (2021-10-13T12:57:19Z) - Scalable Font Reconstruction with Dual Latent Manifolds [55.29525824849242]
We propose a deep generative model that performs typography analysis and font reconstruction.
Our approach enables us to massively scale up the number of character types we can effectively model.
We evaluate on the task of font reconstruction over various datasets representing character types of many languages.
arXiv Detail & Related papers (2021-09-10T20:37:43Z) - Font Completion and Manipulation by Cycling Between Multi-Modality
Representations [113.26243126754704]
We explore the generation of font glyphs as 2D graphic objects, using a graph as an intermediate representation.
We formulate a cross-modality cycled image-to-image structure with a graph between an image encoder and an image renderer.
Our model generates better results than both the image-to-image baseline and previous state-of-the-art methods for glyph completion.
arXiv Detail & Related papers (2021-08-30T02:43:29Z) - A Multi-Implicit Neural Representation for Fonts [79.6123184198301]
Font-specific discontinuities like edges and corners are difficult to represent using neural networks.
We introduce multi-implicits to represent fonts as a permutation-invariant set of learned implicit functions, without losing these features.
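A hedged sketch of the multi-implicit idea: each glyph is a set of learned implicit functions whose pointwise min (a shape union) is independent of ordering, hence permutation-invariant. The architecture below is illustrative only:

```python
# Illustrative only: glyph as a permutation-invariant set of implicit functions.
import torch
import torch.nn as nn

class MultiImplicit(nn.Module):
    def __init__(self, n_implicits=4, hidden=64):
        super().__init__()
        # One small MLP per implicit; each maps a 2D point to a signed distance.
        self.fns = nn.ModuleList(
            nn.Sequential(nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, 1))
            for _ in range(n_implicits)
        )

    def forward(self, xy):
        # xy: (N, 2) query points; min over implicits is order-independent.
        values = torch.cat([f(xy) for f in self.fns], dim=1)  # (N, n_implicits)
        return values.min(dim=1).values                       # (N,) distances

sdf = MultiImplicit()(torch.rand(100, 2))
```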
arXiv Detail & Related papers (2021-06-12T21:40:11Z) - Few-Shot Font Generation with Deep Metric Learning [33.12829580813688]
The proposed framework introduces deep metric learning to style encoders.
We performed experiments using black-and-white and shape-distinctive font datasets.
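The deep-metric-learning idea for style encoders can be sketched with a standard triplet margin loss; the `encoder` and glyph batches below are hypothetical placeholders:

```python
# Triplet margin loss on a style encoder; placeholders stand in for real data.
import torch
import torch.nn as nn

triplet = nn.TripletMarginLoss(margin=1.0)

def style_metric_loss(encoder, anchor, same_font, other_font):
    # Pull same-font glyph styles together; push different-font styles apart.
    return triplet(encoder(anchor), encoder(same_font), encoder(other_font))

enc = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 32))
batch = lambda: torch.rand(4, 1, 64, 64)
loss = style_metric_loss(enc, batch(), batch(), batch())
```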
arXiv Detail & Related papers (2020-11-04T10:12:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.