SE-GAN: Skeleton Enhanced GAN-based Model for Brush Handwriting Font Generation
- URL: http://arxiv.org/abs/2204.10484v1
- Date: Fri, 22 Apr 2022 03:56:53 GMT
- Title: SE-GAN: Skeleton Enhanced GAN-based Model for Brush Handwriting Font Generation
- Authors: Shaozu Yuan, Ruixue Liu, Meng Chen, Baoyang Chen, Zhijie Qiu, Xiaodong He
- Abstract summary: Brush handwriting font generation involves holistic structure changes and complex stroke transfer.
We propose a novel GAN-based image translation model that integrates skeleton information.
We also contribute a large-scale brush handwriting font image dataset with six styles and 15,000 high-resolution images.
- Score: 17.06759966521758
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous works on font generation mainly focus on standard print fonts,
where a character's shape is stable and strokes are clearly separated. There has
been little research on brush handwriting font generation, which involves holistic
structure changes and complex stroke transfer. To address this issue, we
propose a novel GAN-based image translation model that integrates skeleton
information. We first extract the skeleton from training images, then design an
image encoder and a skeleton encoder to extract the corresponding features. A
self-attentive refined attention module is devised to guide the model to learn
distinctive features between different domains. A skeleton discriminator first
synthesizes a skeleton image from the generated image with a pre-trained
generator, then judges its realness against the target one. We also
contribute a large-scale brush handwriting font image dataset with six styles
and 15,000 high-resolution images. Both quantitative and qualitative
experimental results demonstrate the competitiveness of our proposed model.
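The abstract describes the pipeline in prose; below is a minimal PyTorch sketch of how those pieces could fit together. It is a hypothetical rendering, not the authors' code: the skeleton extraction uses scikit-image's `skeletonize`, and the binarization threshold, channel sizes, and the gating design of `AttentiveFusion` (a simplified stand-in for the paper's self-attentive refined attention module) are all illustrative assumptions.

```python
import numpy as np
import torch
import torch.nn as nn
from skimage.morphology import skeletonize


def extract_skeleton(glyph: np.ndarray) -> np.ndarray:
    """Binarize a grayscale glyph (assumed dark strokes on light paper)
    and thin the strokes to a 1-pixel-wide skeleton."""
    binary = glyph < 128
    return skeletonize(binary).astype(np.float32)


class ConvEncoder(nn.Module):
    """Downsampling CNN used for both the image and skeleton encoders."""
    def __init__(self, in_ch: int = 1, dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, dim, 4, 2, 1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, 2 * dim, 4, 2, 1), nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class AttentiveFusion(nn.Module):
    """Hypothetical stand-in for the self-attentive refined attention module:
    a learned gate decides, per location, how much skeleton information
    to mix into the image features."""
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(2 * dim, dim, 1), nn.Sigmoid())

    def forward(self, img_feat: torch.Tensor, skel_feat: torch.Tensor) -> torch.Tensor:
        a = self.gate(torch.cat([img_feat, skel_feat], dim=1))
        return a * img_feat + (1 - a) * skel_feat


class SkeletonDiscriminator(nn.Module):
    """Maps a generated glyph back to a skeleton with a frozen pre-trained
    generator, then scores that skeleton's realness (PatchGAN-style)."""
    def __init__(self, pretrained_skel_gen: nn.Module, dim: int = 64):
        super().__init__()
        self.skel_gen = pretrained_skel_gen.eval()
        for p in self.skel_gen.parameters():
            p.requires_grad_(False)
        self.critic = nn.Sequential(
            nn.Conv2d(1, dim, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(dim, 1, 4, 2, 1),
        )

    def forward(self, generated_glyph: torch.Tensor) -> torch.Tensor:
        skel = self.skel_gen(generated_glyph)  # frozen skeleton synthesis
        return self.critic(skel)               # patch-wise realness scores


# Example wiring (shapes only):
# glyph = torch.rand(1, 1, 64, 64)   # source-style glyph
# skel  = torch.rand(1, 1, 64, 64)   # its extracted skeleton
# fused = AttentiveFusion(128)(ConvEncoder()(glyph), ConvEncoder()(skel))
```

In a full training setup, the fused features would presumably feed a decoder that renders the target-style glyph, with the skeleton discriminator's score added to the usual adversarial and reconstruction losses.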
Related papers
- Calligrapher: Freestyle Text Image Customization [72.71919410487881]
Calligrapher is a novel diffusion-based framework that integrates advanced text customization with artistic typography.
By automating high-quality, visually consistent typography, Calligrapher surpasses traditional models.
arXiv Detail & Related papers (2025-06-30T17:59:06Z)
- FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation [38.730628018627975]
This research aims to tackle the generation of text effects for multilingual fonts.
We introduce a novel shape-adaptive diffusion model capable of interpreting the given shape.
We also present a training-free, shape-adaptive effect transfer method for transferring textures from a generated reference letter to others.
arXiv Detail & Related papers (2024-06-12T16:43:47Z)
- DeepCalliFont: Few-shot Chinese Calligraphy Font Synthesis by Integrating Dual-modality Generative Models [20.76773399161289]
Few-shot font generation, especially for Chinese calligraphy fonts, is a challenging and ongoing problem.
We propose a novel model, DeepCalliFont, for few-shot Chinese calligraphy font synthesis by integrating dual-modality generative models.
arXiv Detail & Related papers (2023-12-16T04:23:12Z)
- UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models [25.219960711604728]
This paper proposes a novel approach for text image generation, utilizing a pre-trained diffusion model.
Our approach involves the design and training of a light-weight character-level text encoder, which replaces the original CLIP encoder.
By employing an inference stage refinement process, we achieve a notably high sequence accuracy when synthesizing text in arbitrarily given images.
arXiv Detail & Related papers (2023-12-08T07:47:46Z)
- Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion [50.59261592343479]
We present Kandinsky, a novel exploration of latent diffusion architecture.
Its image prior model is trained separately to map text embeddings to the image embeddings of CLIP.
We also deployed a user-friendly demo system that supports diverse generative modes such as text-to-image generation, image fusion, text and image fusion, image variations generation, and text-guided inpainting/outpainting.
arXiv Detail & Related papers (2023-10-05T12:29:41Z)
- DS-Fusion: Artistic Typography via Discriminated and Stylized Diffusion [10.75789076591325]
We introduce a novel method to automatically generate artistic typography by stylizing one or more letter fonts.
Our approach utilizes large language models to bridge text and visual images for stylization, and builds an unsupervised generative model.
arXiv Detail & Related papers (2023-03-16T19:12:52Z)
- Diff-Font: Diffusion Model for Robust One-Shot Font Generation [110.45944936952309]
We propose a novel one-shot font generation method based on a diffusion model, named Diff-Font.
The proposed model aims to generate the entire font library by giving only one sample as the reference.
The well-trained Diff-Font is not only robust to font gaps and font variations, but also achieves promising performance on difficult character generation.
arXiv Detail & Related papers (2022-12-12T13:51:50Z)
- eDiffi: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers [87.52504764677226]
Large-scale diffusion-based generative models have led to breakthroughs in text-conditioned high-resolution image synthesis.
We train an ensemble of text-to-image diffusion models, each specialized for a different stage of synthesis.
Our ensemble of diffusion models, called eDiffi, results in improved text alignment while maintaining the same inference cost.
arXiv Detail & Related papers (2022-11-02T17:43:04Z)
- Scaling Autoregressive Models for Content-Rich Text-to-Image Generation [95.02406834386814]
Parti treats text-to-image generation as a sequence-to-sequence modeling problem.
Parti uses a Transformer-based image tokenizer, ViT-VQGAN, to encode images as sequences of discrete tokens (a toy sketch of this setup follows the list below).
PartiPrompts (P2) is a new holistic benchmark of over 1600 English prompts.
arXiv Detail & Related papers (2022-06-22T01:11:29Z)
- Scalable Font Reconstruction with Dual Latent Manifolds [55.29525824849242]
We propose a deep generative model that performs typography analysis and font reconstruction.
Our approach enables us to massively scale up the number of character types we can effectively model.
We evaluate on the task of font reconstruction over various datasets representing character types of many languages.
arXiv Detail & Related papers (2021-09-10T20:37:43Z)
- Font Completion and Manipulation by Cycling Between Multi-Modality Representations [113.26243126754704]
We explore the generation of font glyphs as 2D graphic objects, with a graph as the intermediate representation.
We formulate a cross-modality cycled image-to-image structure with the graph between an image encoder and an image decoder.
Our model produces better results than both the image-to-image baseline and previous state-of-the-art methods for glyph completion.
arXiv Detail & Related papers (2021-08-30T02:43:29Z)
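Among the entries above, Parti frames text-to-image generation as sequence-to-sequence prediction of discrete image tokens. The toy sketch below illustrates that framing only; the vocabulary sizes, the use of `nn.Transformer`, and the class name are assumptions, and the ViT-VQGAN tokenizer that maps token ids back to pixels is omitted.

```python
import torch
import torch.nn as nn


class ToyTextToImageTokens(nn.Module):
    """Toy seq2seq model in the spirit of Parti: text tokens in, discrete
    image-token logits out. A separately trained image tokenizer (ViT-VQGAN
    in the paper) would map predicted ids back to pixels."""
    def __init__(self, text_vocab: int = 32000, image_vocab: int = 8192,
                 d_model: int = 512):
        super().__init__()
        self.text_emb = nn.Embedding(text_vocab, d_model)
        self.img_emb = nn.Embedding(image_vocab, d_model)
        self.seq2seq = nn.Transformer(d_model=d_model, batch_first=True)
        self.head = nn.Linear(d_model, image_vocab)

    def forward(self, text_ids: torch.Tensor, image_ids: torch.Tensor) -> torch.Tensor:
        # Teacher forcing: predict each image token from the text and the
        # previously generated image tokens (causal mask on the decoder).
        tgt_mask = self.seq2seq.generate_square_subsequent_mask(image_ids.size(1))
        h = self.seq2seq(self.text_emb(text_ids), self.img_emb(image_ids),
                         tgt_mask=tgt_mask)
        return self.head(h)  # logits: (batch, seq_len, image_vocab)
```

At inference one would decode the image tokens autoregressively and hand the resulting id sequence to the tokenizer's decoder for rendering.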
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.