FontDiffuser: One-Shot Font Generation via Denoising Diffusion with
Multi-Scale Content Aggregation and Style Contrastive Learning
- URL: http://arxiv.org/abs/2312.12142v1
- Date: Tue, 19 Dec 2023 13:23:20 GMT
- Title: FontDiffuser: One-Shot Font Generation via Denoising Diffusion with
Multi-Scale Content Aggregation and Style Contrastive Learning
- Authors: Zhenhua Yang, Dezhi Peng, Yuxin Kong, Yuyi Zhang, Cong Yao, Lianwen
Jin
- Abstract summary: FontDiffuser is a diffusion-based image-to-image one-shot font generation method.
It consistently excels on complex characters and large style changes compared to previous methods.
- Score: 45.696909070215476
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic font generation is an imitation task, which aims to create a font
library that mimics the style of reference images while preserving the content
from source images. Although existing font generation methods have achieved
satisfactory performance, they still struggle with complex characters and large
style variations. To address these issues, we propose FontDiffuser, a
diffusion-based image-to-image one-shot font generation method, which
innovatively models the font imitation task as a noise-to-denoise paradigm. In
our method, we introduce a Multi-scale Content Aggregation (MCA) block, which
effectively combines global and local content cues across different scales,
leading to enhanced preservation of intricate strokes of complex characters.
Moreover, to better manage the large variations in style transfer, we propose a
Style Contrastive Refinement (SCR) module, which is a novel structure for style
representation learning. It utilizes a style extractor to disentangle styles
from images, subsequently supervising the diffusion model via a meticulously
designed style contrastive loss. Extensive experiments demonstrate
FontDiffuser's state-of-the-art performance in generating diverse characters
and styles. It consistently excels on complex characters and large style
changes compared to previous methods. The code is available at
https://github.com/yeungchenwa/FontDiffuser.
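To make the MCA idea concrete, the sketch below shows one way a UNet feature map could be fused with content-encoder features taken at several resolutions, so that both global layout and local stroke cues condition the denoising step. The module name, channel handling, and concatenate-and-convolve fusion are assumptions chosen for illustration, not the authors' implementation; see the repository above for the actual code.

```python
# Hypothetical sketch of multi-scale content aggregation (names, shapes, and
# the fusion operator are illustrative assumptions, not the paper's exact design).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleContentAggregation(nn.Module):
    """Fuses a UNet feature map with content features from several scales."""

    def __init__(self, unet_channels, content_channels):
        super().__init__()
        # One 1x1 projection per content scale so every cue matches the UNet width.
        self.proj = nn.ModuleList(
            [nn.Conv2d(c, unet_channels, kernel_size=1) for c in content_channels]
        )
        # Fuse the concatenated cues back down to the UNet channel width.
        self.fuse = nn.Conv2d(
            unet_channels * (len(content_channels) + 1),
            unet_channels,
            kernel_size=3,
            padding=1,
        )

    def forward(self, unet_feat, content_feats):
        cues = [unet_feat]
        for proj, feat in zip(self.proj, content_feats):
            # Resample each content scale to the UNet resolution before fusing,
            # keeping both coarse (global layout) and fine (local stroke) cues.
            cues.append(
                F.interpolate(proj(feat), size=unet_feat.shape[-2:],
                              mode="bilinear", align_corners=False)
            )
        return self.fuse(torch.cat(cues, dim=1))
```

A block like this would sit inside the denoising UNet wherever content conditioning is injected.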
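The SCR supervision can likewise be pictured as a contrastive objective: the style embedding of a generated glyph is pulled toward a reference glyph in the target style and pushed away from glyphs in other styles. The InfoNCE-style loss below is only a hedged sketch of that idea; the paper's exact loss and the style extractor architecture may differ.

```python
# Hypothetical InfoNCE-style sketch of a style contrastive loss (the exact
# formulation used by SCR may differ).
import torch
import torch.nn.functional as F

def style_contrastive_loss(gen_style, pos_style, neg_styles, temperature=0.07):
    """gen_style:  (B, D) style embeddings of generated glyphs
    pos_style:  (B, D) embeddings of reference glyphs in the target style
    neg_styles: (B, K, D) embeddings of glyphs in other (negative) styles"""
    gen = F.normalize(gen_style, dim=-1)
    pos = F.normalize(pos_style, dim=-1)
    neg = F.normalize(neg_styles, dim=-1)

    # Cosine similarity with the single positive and the K negatives.
    pos_sim = (gen * pos).sum(dim=-1, keepdim=True)      # (B, 1)
    neg_sim = torch.einsum("bd,bkd->bk", gen, neg)       # (B, K)

    logits = torch.cat([pos_sim, neg_sim], dim=1) / temperature
    # The positive sits at index 0, so the target label is 0 for every sample.
    labels = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)
    return F.cross_entropy(logits, labels)
```

During training, the embeddings would come from the style extractor applied to the denoised output and to reference glyphs, so the gradient steers the diffusion model toward the target style.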
Related papers
- JoyType: A Robust Design for Multilingual Visual Text Creation [14.441897362967344]
We introduce a novel approach for multilingual visual text creation, named JoyType.
JoyType is designed to maintain the font style of text during the image generation process.
Our evaluations, based on both visual and accuracy metrics, demonstrate that JoyType significantly outperforms existing state-of-the-art methods.
arXiv Detail & Related papers (2024-09-26T04:23:17Z)
- CF-Font: Content Fusion for Few-shot Font Generation [63.79915037830131]
We propose a content fusion module (CFM) to project the content feature into a linear space defined by the content features of basis fonts.
Our method also allows optimizing the style representation vector of reference images.
We have evaluated our method on a dataset of 300 fonts with 6.5k characters each.
arXiv Detail & Related papers (2023-03-24T14:18:40Z)
- Few-shot Font Generation by Learning Style Difference and Similarity [84.76381937516356]
We propose a novel font generation approach by learning the Difference between different styles and the Similarity of the same style (DS-Font).
Specifically, we propose a multi-layer style projector for style encoding and realize a distinctive style representation via our proposed Cluster-level Contrastive Style (CCS) loss.
arXiv Detail & Related papers (2023-01-24T13:57:25Z)
- DGFont++: Robust Deformable Generative Networks for Unsupervised Font Generation [19.473023811252116]
We propose a robust deformable generative network for unsupervised font generation (abbreviated as DGFont++).
To distinguish different styles, we train our model with a multi-task discriminator, which ensures that each style can be discriminated independently.
Experiments demonstrate that our model is able to generate character images of higher quality than state-of-the-art methods.
arXiv Detail & Related papers (2022-12-30T14:35:10Z)
- Diff-Font: Diffusion Model for Robust One-Shot Font Generation [110.45944936952309]
We propose a novel one-shot font generation method based on a diffusion model, named Diff-Font.
The proposed model aims to generate the entire font library given only one sample as the reference.
The well-trained Diff-Font is not only robust to font gaps and font variations, but also achieves promising performance on difficult character generation.
arXiv Detail & Related papers (2022-12-12T13:51:50Z)
- DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization [66.42741426640633]
DiffStyler is a dual diffusion processing architecture to control the balance between the content and style of diffused results.
We propose content image-based learnable noise on which the reverse denoising process operates, enabling the stylization results to better preserve the structural information of the content image.
arXiv Detail & Related papers (2022-11-19T12:30:44Z)
- Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning [84.8813842101747]
Contrastive Arbitrary Style Transfer (CAST) is a new style representation learning and style transfer method via contrastive learning.
Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.
arXiv Detail & Related papers (2022-05-19T13:11:24Z)
- XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation [13.569449355929574]
We propose a self-supervised cross-modality pre-training strategy and a cross-modality transformer-based encoder.
The encoder is conditioned jointly on the glyph image and the corresponding stroke labels.
It requires only one reference glyph and achieves the lowest rate of bad cases in the few-shot font generation task, 28% lower than the second best.
arXiv Detail & Related papers (2022-04-11T13:34:40Z)