Calligrapher: Freestyle Text Image Customization
- URL: http://arxiv.org/abs/2506.24123v1
- Date: Mon, 30 Jun 2025 17:59:06 GMT
- Title: Calligrapher: Freestyle Text Image Customization
- Authors: Yue Ma, Qingyan Bai, Hao Ouyang, Ka Leong Cheng, Qiuyu Wang, Hongyu Liu, Zichen Liu, Haofan Wang, Jingye Chen, Yujun Shen, Qifeng Chen
- Abstract summary: Calligrapher is a novel diffusion-based framework that integrates advanced text customization with artistic typography. By automating high-quality, visually consistent typography, Calligrapher surpasses traditional models.
- Score: 72.71919410487881
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce Calligrapher, a novel diffusion-based framework that innovatively integrates advanced text customization with artistic typography for digital calligraphy and design applications. Addressing the challenges of precise style control and data dependency in typographic customization, our framework incorporates three key technical contributions. First, we develop a self-distillation mechanism that leverages the pre-trained text-to-image generative model itself alongside the large language model to automatically construct a style-centric typography benchmark. Second, we introduce a localized style injection framework via a trainable style encoder, which comprises both Qformer and linear layers, to extract robust style features from reference images. An in-context generation mechanism is also employed to directly embed reference images into the denoising process, further enhancing the refined alignment of target styles. Extensive quantitative and qualitative evaluations across diverse fonts and design contexts confirm Calligrapher's accurate reproduction of intricate stylistic details and precise glyph positioning. By automating high-quality, visually consistent typography, Calligrapher surpasses traditional models, empowering creative practitioners in digital art, branding, and contextual typographic design.
Related papers
- WordCraft: Interactive Artistic Typography with Attention Awareness and Noise Blending [12.655120187133779]
Artistic typography aims to stylize input characters with visual effects that are both creative and legible. Traditional approaches rely heavily on manual design, while recent generative models, particularly diffusion-based methods, have enabled automated character stylization. We introduce WordCraft, an interactive artistic typography system that integrates diffusion models to address these limitations.
arXiv Detail & Related papers (2025-07-13T10:49:09Z)
- DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models [115.62816053600085]
We present DesignDiffusion, a framework for synthesizing design images from textual descriptions. The proposed framework directly synthesizes textual and visual design elements from user prompts. It utilizes a distinctive character embedding derived from the visual text to enhance the input prompt.
arXiv Detail & Related papers (2025-03-03T15:22:57Z)
- VitaGlyph: Vitalizing Artistic Typography with Flexible Dual-branch Diffusion Models [53.59400446543756]
We introduce a dual-branch and training-free method, namely VitaGlyph, to enable flexible artistic typography.
VitaGlyph treats the input character as a scene composed of Subject and Surrounding, then renders them under varying degrees of geometric transformation.
Experimental results demonstrate that VitaGlyph not only achieves better artistry and readability, but also manages to depict multiple customized concepts.
arXiv Detail & Related papers (2024-10-02T16:48:47Z)
- Ada-adapter: Fast Few-shot Style Personalization of Diffusion Model with Pre-trained Image Encoder [57.574544285878794]
Ada-Adapter is a novel framework for few-shot style personalization of diffusion models.
Our method enables efficient zero-shot style transfer utilizing a single reference image.
We demonstrate the effectiveness of our approach on various artistic styles, including flat art, 3D rendering, and logo design.
arXiv Detail & Related papers (2024-07-08T02:00:17Z)
- Few-shot Calligraphy Style Learning [0.0]
"Presidifussion" is a novel approach to learning and replicating the unique style of calligraphy of President Xu.
We introduce innovative techniques of font image conditioning and stroke information conditioning, enabling the model to capture the intricate structural elements of Chinese characters.
This work not only presents a breakthrough in the digital preservation of calligraphic art but also sets a new standard for data-efficient generative modeling in the domain of cultural heritage digitization.
arXiv Detail & Related papers (2024-04-26T07:17:09Z)
- StyleForge: Enhancing Text-to-Image Synthesis for Any Artistic Styles with Dual Binding [7.291687946822539]
We introduce Single-StyleForge, a novel approach for personalized text-to-image synthesis across diverse artistic styles.
We also present Multi-StyleForge, which enhances image quality and text alignment by binding multiple tokens to partial style attributes.
arXiv Detail & Related papers (2024-04-08T07:43:23Z)
- Style Aligned Image Generation via Shared Attention [61.121465570763085]
We introduce StyleAligned, a technique designed to establish style alignment among a series of generated images.
By employing minimal "attention sharing" during the diffusion process, our method maintains style consistency across images within T2I models.
Our method's evaluation across diverse styles and text prompts demonstrates high quality and fidelity.
arXiv Detail & Related papers (2023-12-04T18:55:35Z)
- ControlStyle: Text-Driven Stylized Image Generation Using Diffusion Priors [105.37795139586075]
We propose a new task for "stylizing" text-to-image models, namely text-driven stylized image generation.
We present a new diffusion model (ControlStyle) via upgrading a pre-trained text-to-image model with a trainable modulation network.
Experiments demonstrate the effectiveness of our ControlStyle in producing more visually pleasing and artistic results.
arXiv Detail & Related papers (2023-11-09T15:50:52Z)
- Towards Diverse and Consistent Typography Generation [15.300255326619203]
We formulate typography generation as fine-grained attribute generation for multiple text elements.
We build an autoregressive model to generate diverse typography that matches the input design context.
arXiv Detail & Related papers (2023-09-05T10:08:11Z)
- GenText: Unsupervised Artistic Text Generation via Decoupled Font and Texture Manipulation [30.654807125764965]
We propose a novel approach, namely GenText, to achieve general artistic text style transfer.
Specifically, our work incorporates three different stages: stylization, destylization, and font transfer.
Considering the difficult data acquisition of paired artistic text images, our model is designed under the unsupervised setting.
arXiv Detail & Related papers (2022-07-20T04:42:47Z)
- SE-GAN: Skeleton Enhanced GAN-based Model for Brush Handwriting Font Generation [17.06759966521758]
Brush handwriting font generation involves holistic structure changes and complex stroke transfer.
We propose a novel GAN-based image translation model by integrating the skeleton information.
We also contribute a large-scale brush handwriting font image dataset with six styles and 15,000 high-resolution images.
arXiv Detail & Related papers (2022-04-22T03:56:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.