FontAdapter: Instant Font Adaptation in Visual Text Generation
- URL: http://arxiv.org/abs/2506.05843v1
- Date: Fri, 06 Jun 2025 08:00:49 GMT
- Title: FontAdapter: Instant Font Adaptation in Visual Text Generation
- Authors: Myungkyu Koo, Subin Kim, Sangkyung Kwak, Jaehyun Nam, Seojin Kim, Jinwoo Shin
- Abstract summary: We present FontAdapter, a framework that enables visual text generation in unseen fonts within seconds, conditioned on a reference glyph image. Experiments demonstrate that FontAdapter enables high-quality, robust font customization across unseen fonts without additional fine-tuning during inference.
- Score: 45.00544198317519
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-to-image diffusion models have significantly improved the seamless integration of visual text into diverse image contexts. Recent approaches further improve control over font styles through fine-tuning with predefined font dictionaries. However, adapting unseen fonts outside the preset is computationally expensive, often requiring tens of minutes, making real-time customization impractical. In this paper, we present FontAdapter, a framework that enables visual text generation in unseen fonts within seconds, conditioned on a reference glyph image. To this end, we find that direct training on font datasets fails to capture nuanced font attributes, limiting generalization to new glyphs. To overcome this, we propose a two-stage curriculum learning approach: FontAdapter first learns to extract font attributes from isolated glyphs and then integrates these styles into diverse natural backgrounds. To support this two-stage training scheme, we construct synthetic datasets tailored to each stage, leveraging large-scale online fonts effectively. Experiments demonstrate that FontAdapter enables high-quality, robust font customization across unseen fonts without additional fine-tuning during inference. Furthermore, it supports visual text editing, font style blending, and cross-lingual font transfer, positioning FontAdapter as a versatile framework for font customization tasks.
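To make the two-stage curriculum concrete, here is a minimal PyTorch sketch: stage 1 trains a glyph-style encoder on isolated glyph renders, and stage 2 reuses the same weights on glyphs composited into natural scenes. The module names, the toy denoiser, and the MSE objective are illustrative placeholders, not the paper's implementation.

```python
# Hedged sketch of the two-stage curriculum: stage 1 trains the glyph-style
# encoder on isolated glyph renders; stage 2 keeps the same encoder but
# switches to glyphs composited onto natural backgrounds.
# All module and dataset names are illustrative, not the paper's code.
import torch
import torch.nn as nn

class GlyphStyleEncoder(nn.Module):
    """Extracts a font-style embedding from a reference glyph image."""
    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, dim),
        )
    def forward(self, glyph):
        return self.net(glyph)

def train_stage(encoder, denoiser, loader, steps, lr=1e-4):
    """One curriculum stage: condition a (placeholder) denoiser on the
    reference-glyph embedding and regress the target image."""
    opt = torch.optim.AdamW(
        list(encoder.parameters()) + list(denoiser.parameters()), lr=lr)
    for step, (reference_glyph, target_image) in zip(range(steps), loader):
        style = encoder(reference_glyph)       # (B, dim) font-style condition
        pred = denoiser(target_image, style)   # stand-in for the diffusion loss
        loss = nn.functional.mse_loss(pred, target_image)
        opt.zero_grad(); loss.backward(); opt.step()

# Toy denoiser and synthetic loaders so the sketch runs end to end.
class ToyDenoiser(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.film = nn.Linear(dim, 3)
    def forward(self, x, style):
        return x + self.film(style)[:, :, None, None]

def synthetic_loader(n):  # yields (reference glyph, target image) pairs
    return ((torch.randn(4, 3, 64, 64), torch.randn(4, 3, 64, 64))
            for _ in range(n))

encoder, denoiser = GlyphStyleEncoder(), ToyDenoiser()
train_stage(encoder, denoiser, synthetic_loader(100), steps=100)  # stage 1: isolated glyphs
train_stage(encoder, denoiser, synthetic_loader(100), steps=100)  # stage 2: glyphs in scenes
```

The key point of the curriculum is that both stages share the same encoder weights; only the data distribution changes between stages.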
Related papers
- ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations [8.588945675550592]
This work demonstrates that diffusion models can achieve font-controllable multilingual text rendering using just raw images without font label annotations. The experiments provide a proof of concept of our algorithm in zero-shot text and font editing across diverse fonts and languages.
arXiv Detail & Related papers (2025-02-16T05:30:18Z)
- JoyType: A Robust Design for Multilingual Visual Text Creation [14.441897362967344]
We introduce a novel approach for multilingual visual text creation, named JoyType. JoyType is designed to maintain the font style of text during the image generation process. Our evaluations, based on both visual and accuracy metrics, demonstrate that JoyType significantly outperforms existing state-of-the-art methods.
arXiv Detail & Related papers (2024-09-26T04:23:17Z)
- VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and Quantization [52.870638830417]
We propose a VQGAN-based framework (i.e., VQ-Font) to enhance glyph fidelity through token prior refinement and structure-aware enhancement.
Specifically, we pre-train a VQGAN to encapsulate font token prior within a codebook. Subsequently, VQ-Font refines the synthesized glyphs with the codebook to eliminate the domain gap between synthesized and real-world strokes.
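A minimal sketch of the codebook-refinement step described above, assuming a pre-trained codebook is available: each feature of a synthesized glyph is snapped to its nearest codebook entry, so decoded strokes come from the real-glyph token distribution. The random codebook and shapes below are placeholders.

```python
# Sketch of codebook-based refinement: map each feature vector of a
# synthesized glyph to its nearest entry in a (pre-trained) VQGAN codebook.
# The codebook here is random, for illustration only.
import torch

def quantize_to_codebook(features, codebook):
    """features: (N, D) glyph-patch features; codebook: (K, D) token embeddings.
    Returns the nearest codebook vector for every feature, plus its index."""
    dists = torch.cdist(features, codebook)  # (N, K) pairwise L2 distances
    indices = dists.argmin(dim=1)            # nearest token per feature
    return codebook[indices], indices

codebook = torch.randn(512, 64)       # K=512 tokens, D=64 dims (placeholder)
synth_features = torch.randn(16, 64)  # features of a synthesized glyph
refined, tokens = quantize_to_codebook(synth_features, codebook)
print(tokens.shape, refined.shape)    # torch.Size([16]) torch.Size([16, 64])
```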
arXiv Detail & Related papers (2023-08-27T06:32:20Z)
- CF-Font: Content Fusion for Few-shot Font Generation [63.79915037830131]
We propose a content fusion module (CFM) to project the content feature into a linear space defined by the content features of basis fonts.
Our method also allows optimizing the style representation vector of reference images.
We have evaluated our method on a dataset of 300 fonts with 6.5k characters each.
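The projection can be read as a convex combination over basis fonts; here is a minimal sketch, with an assumed similarity-based weighting that is not necessarily CF-Font's exact scheme.

```python
# Sketch of a content-fusion step: express a target content feature as a
# convex combination of the content features of B basis fonts, i.e. a
# projection onto the linear space they span. Illustrative only.
import torch

def content_fusion(content, basis):
    """content: (D,) feature of the input character; basis: (B, D) features of
    the same character rendered in B basis fonts."""
    # Similarity-based weights, normalized to sum to 1 (a convex combination).
    weights = torch.softmax(basis @ content, dim=0)  # (B,)
    return weights @ basis                           # (D,) fused content feature

basis = torch.randn(10, 128)   # 10 basis fonts, 128-dim content features
content = torch.randn(128)
fused = content_fusion(content, basis)
```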
arXiv Detail & Related papers (2023-03-24T14:18:40Z)
- Diff-Font: Diffusion Model for Robust One-Shot Font Generation [110.45944936952309]
We propose a novel one-shot font generation method based on a diffusion model, named Diff-Font.
The proposed model aims to generate the entire font library by giving only one sample as the reference.
The well-trained Diff-Font is not only robust to font gaps and font variations, but also achieves promising performance on difficult character generation.
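A minimal sketch of the one-shot setup Diff-Font describes: encode the single reference glyph once, then condition generation of every character on the pair (content, style). The toy linear decoder below stands in for the actual diffusion sampler; all names are illustrative.

```python
# Sketch of one-shot library generation: extract a style embedding from the
# single reference glyph, then generate every character conditioned on
# (character id, shared style). Not a real diffusion loop.
import torch
import torch.nn as nn

class OneShotFontSketch(nn.Module):
    def __init__(self, n_chars=1000, dim=128):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, dim)   # "content" condition
        self.style_enc = nn.Linear(64 * 64, dim)     # toy reference encoder
        self.decode = nn.Linear(2 * dim, 64 * 64)    # stand-in for the sampler

    def forward(self, reference, char_ids):
        style = self.style_enc(reference.flatten(1))            # (1, dim)
        style = style.expand(char_ids.shape[0], -1)             # one shared style
        cond = torch.cat([self.char_emb(char_ids), style], -1)  # content + style
        return self.decode(cond).view(-1, 64, 64)

model = OneShotFontSketch()
reference = torch.randn(1, 64, 64)             # the single reference sample
glyphs = model(reference, torch.arange(1000))  # whole library in the new font
```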
arXiv Detail & Related papers (2022-12-12T13:51:50Z)
- Font Representation Learning via Paired-glyph Matching [15.358456947574913]
We propose a novel font representation learning scheme to embed font styles into the latent space.
To make a font's representation discriminative against other fonts, we propose a paired-glyph matching-based font representation learning model.
We show that our font representation learning scheme achieves better generalization performance than existing font representation learning techniques.
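A minimal sketch of paired-glyph matching as an InfoNCE-style contrastive objective, assuming each batch pairs two different glyphs per font; the temperature and shapes are placeholders, not the paper's settings.

```python
# Sketch of paired-glyph matching as a contrastive objective: embeddings of
# two different glyphs from the SAME font are pulled together; glyphs from
# other fonts in the batch are pushed away. Illustrative only.
import torch
import torch.nn.functional as F

def paired_glyph_loss(emb_a, emb_b, temperature=0.1):
    """emb_a[i] and emb_b[i] come from the same font, different glyphs.
    Shapes: (N, D)."""
    emb_a = F.normalize(emb_a, dim=1)
    emb_b = F.normalize(emb_b, dim=1)
    logits = emb_a @ emb_b.T / temperature  # (N, N) font-similarity matrix
    targets = torch.arange(emb_a.shape[0])  # matching pairs on the diagonal
    return F.cross_entropy(logits, targets)

loss = paired_glyph_loss(torch.randn(8, 128), torch.randn(8, 128))
```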
arXiv Detail & Related papers (2022-11-20T12:27:27Z)
- Few-Shot Font Generation by Learning Fine-Grained Local Styles [90.39288370855115]
Few-shot font generation (FFG) aims to generate a new font with a few examples.
We propose a new font generation approach by learning 1) the fine-grained local styles from references, and 2) the spatial correspondence between the content and reference glyphs.
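One common way to realize such spatial correspondence is cross-attention from content-glyph locations to reference-glyph patches; below is a hedged sketch using standard multi-head attention, not necessarily the paper's exact mechanism.

```python
# Sketch of the spatial-correspondence idea: content-glyph features attend to
# local patches of the reference glyphs, so each output location picks up the
# matching fine-grained local style.
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=128, num_heads=4, batch_first=True)
content = torch.randn(2, 64, 128)        # 64 spatial locations of the content glyph
reference = torch.randn(2, 3 * 64, 128)  # local features from 3 reference glyphs
styled, weights = attn(query=content, key=reference, value=reference)
# 'weights' (2, 64, 192) is the learned correspondence from each content
# location to reference patches; 'styled' carries the transferred local styles.
print(styled.shape)  # torch.Size([2, 64, 128])
```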
arXiv Detail & Related papers (2022-05-20T05:07:05Z)
- Scalable Font Reconstruction with Dual Latent Manifolds [55.29525824849242]
We propose a deep generative model that performs typography analysis and font reconstruction.
Our approach enables us to massively scale up the number of character types we can effectively model.
We evaluate on the task of font reconstruction over various datasets representing character types of many languages.
arXiv Detail & Related papers (2021-09-10T20:37:43Z)
- Few-shot Compositional Font Generation with Dual Memory [16.967987801167514]
We propose a novel font generation framework, named Dual Memory-augmented Font Generation Network (DM-Font).
We employ memory components and global-context awareness in the generator to take advantage of the compositionality.
In the experiments on Korean-handwriting fonts and Thai-printing fonts, we observe that our method generates a significantly better quality of samples with faithful stylization.
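A minimal sketch of the dual-memory idea, assuming a component-ID decomposition of each character (e.g., the jamo of a Korean syllable); the write/read interface and the naive sum composition are illustrative assumptions, not DM-Font's actual design.

```python
# Sketch of dual memory for compositional scripts: a persistent memory holds
# learned per-component embeddings shared across fonts; a dynamic memory
# caches component features extracted from the few reference glyphs.
import torch
import torch.nn as nn

class DualMemorySketch(nn.Module):
    def __init__(self, n_components=68, dim=128):
        super().__init__()
        self.persistent = nn.Embedding(n_components, dim)  # shared across fonts
        self.dynamic = {}                                  # per-font, from refs

    def write(self, component_id, feature):
        """Store a component feature observed in a reference glyph."""
        self.dynamic[component_id] = feature

    def read(self, component_ids):
        """Fetch each component, preferring the font-specific dynamic entry."""
        rows = [self.dynamic.get(c, self.persistent.weight[c])
                for c in component_ids]
        return torch.stack(rows)

mem = DualMemorySketch()
mem.write(3, torch.randn(128))   # component seen in a reference glyph
parts = mem.read([3, 17, 42])    # e.g., the three components of one character
char_feature = parts.sum(dim=0)  # naive composition of components
```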
arXiv Detail & Related papers (2020-05-21T08:13:40Z)
- Attribute2Font: Creating Fonts You Want From Attributes [32.82714291856353]
Attribute2Font is trained to perform font style transfer between any two fonts conditioned on their attribute values.
A novel unit named the Attribute Attention Module is designed to make the generated glyph images better embody the prominent font attributes.
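A hedged sketch of attribute-conditioned channel gating in the spirit of such a module; the attribute count, the sigmoid gating, and the module name are assumptions, not the paper's exact design.

```python
# Sketch of attribute-conditioned reweighting: attribute values (e.g., italic,
# serif, thickness scores) produce per-channel gates that emphasize feature
# channels tied to the prominent attributes. Illustrative only.
import torch
import torch.nn as nn

class AttributeAttentionSketch(nn.Module):
    def __init__(self, n_attrs=37, channels=64):  # n_attrs is a placeholder
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(n_attrs, channels), nn.Sigmoid())

    def forward(self, feats, attrs):
        """feats: (B, C, H, W) glyph features; attrs: (B, n_attrs) in [0, 1]."""
        g = self.gate(attrs)[:, :, None, None]  # per-channel attention weights
        return feats * g

module = AttributeAttentionSketch()
out = module(torch.randn(2, 64, 16, 16), torch.rand(2, 37))
```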
arXiv Detail & Related papers (2020-05-16T04:06:53Z)