Look Closer to Supervise Better: One-Shot Font Generation via
Component-Based Discriminator
- URL: http://arxiv.org/abs/2205.00146v1
- Date: Sat, 30 Apr 2022 03:41:49 GMT
- Title: Look Closer to Supervise Better: One-Shot Font Generation via
Component-Based Discriminator
- Authors: Yuxin Kong, Canjie Luo, Weihong Ma, Qiyuan Zhu, Shenggao Zhu, Nicholas
Yuan, Lianwen Jin
- Abstract summary: We propose a novel Component-Aware Module (CAM), which supervises the generator to decouple content and style at a more fine-grained level.
Our approach outperforms state-of-the-art one-shot font generation methods.
It can be applied to handwritten word synthesis and scene text image editing.
- Score: 28.325133809296464
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic font generation remains a challenging research issue due to the
large amounts of characters with complicated structures. Typically, only a few
samples can serve as the style/content reference (termed few-shot learning),
which further increases the difficulty to preserve local style patterns or
detailed glyph structures. We investigate the drawbacks of previous studies and
find that a coarse-grained discriminator is insufficient for supervising a font
generator. To this end, we propose a novel Component-Aware Module (CAM), which
supervises the generator to decouple content and style at a more fine-grained
level, i.e., the component level. Different from previous studies
struggling to increase the complexity of generators, we aim to perform more
effective supervision for a relatively simple generator to achieve its full
potential, which is a brand new perspective for font generation. The whole
framework achieves remarkable results by coupling component-level supervision
with adversarial learning, hence we call it Component-Guided GAN, shortly
CG-GAN. Extensive experiments show that our approach outperforms
state-of-the-art one-shot font generation methods. Furthermore, it can be
applied to handwritten word synthesis and scene text image editing, suggesting
the generalization of our approach.
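
As a concrete illustration of the component-level supervision the abstract describes, the following minimal PyTorch sketch adds a component-aware head on top of a discriminator backbone. It is an assumption-laden reading of the idea, not the paper's CG-GAN implementation: the class name, attention mechanism, feature shapes, and classification heads are all illustrative choices.

    import torch
    import torch.nn as nn

    class ComponentAwareHeadSketch(nn.Module):
        """Hypothetical component-aware head for a font-generation discriminator.
        One attention query per component localizes that component in the feature
        map; per-component features then feed small content and style classifiers,
        giving the generator finer-grained feedback than a single real/fake score."""

        def __init__(self, feat_dim: int, num_components: int, num_styles: int):
            super().__init__()
            self.queries = nn.Parameter(torch.randn(num_components, feat_dim))
            self.content_head = nn.Linear(feat_dim, num_components)
            self.style_head = nn.Linear(feat_dim, num_styles)

        def forward(self, feat_map: torch.Tensor):
            # feat_map: (B, C, H, W) features from the discriminator backbone.
            b, c, h, w = feat_map.shape
            feats = feat_map.flatten(2).transpose(1, 2)              # (B, HW, C)
            attn = torch.einsum('kc,bnc->bkn', self.queries, feats).softmax(-1)
            comp_feats = torch.einsum('bkn,bnc->bkc', attn, feats)   # (B, K, C)
            return self.content_head(comp_feats), self.style_head(comp_feats)

    # The generator would then be trained with the usual adversarial loss plus
    # cross-entropy losses on these component-level content/style predictions.
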
Related papers
- Skeleton and Font Generation Network for Zero-shot Chinese Character Generation [53.08596064763731]
We propose a novel Skeleton and Font Generation Network (SFGN) to achieve more robust Chinese character font generation.
We conduct experiments on misspelled characters, a substantial portion of which differ only slightly from common characters.
Our approach visually demonstrates the efficacy of generated images and outperforms current state-of-the-art font generation methods.
arXiv Detail & Related papers (2025-01-14T12:15:49Z) - Decoupling Layout from Glyph in Online Chinese Handwriting Generation [6.566541829858544]
We develop a text line layout generator and stylized font synthesizer.
The layout generator performs in-context-like learning based on the text content and the provided style references to generate positions for each glyph autoregressively.
The font synthesizer, which consists of a character embedding dictionary, a multi-scale calligraphy style encoder, and a 1D U-Net-based diffusion denoiser, generates each glyph at its position while imitating the calligraphy style extracted from the given style references.
arXiv Detail & Related papers (2024-10-03T08:46:17Z) - Few shot font generation via transferring similarity guided global style
and quantization local style [11.817299400850176]
We present a novel font generation approach by aggregating styles from character similarity-guided global features and stylized component-level representations.
Our AFFG method can obtain a complete set of component-level style representations while also controlling the global glyph characteristics.
arXiv Detail & Related papers (2023-09-02T05:05:40Z) - VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and
Quantization [52.870638830417]
We propose a VQGAN-based framework (i.e., VQ-Font) to enhance glyph fidelity through token prior refinement and structure-aware enhancement.
Specifically, we pre-train a VQGAN to encapsulate font token prior within a codebook. Subsequently, VQ-Font refines the synthesized glyphs with the codebook to eliminate the domain gap between synthesized and real-world strokes.
arXiv Detail & Related papers (2023-08-27T06:32:20Z) - Learning Generative Structure Prior for Blind Text Image
Super-resolution [153.05759524358467]
We present a novel prior that focuses more on the character structure.
To restrict the generative space of StyleGAN, we store the discrete features for each character in a codebook.
The proposed structure prior exerts stronger character-specific guidance to restore faithful and precise strokes of a designated character.
arXiv Detail & Related papers (2023-03-26T13:54:28Z) - Diff-Font: Diffusion Model for Robust One-Shot Font Generation [110.45944936952309]
We propose a novel one-shot font generation method based on a diffusion model, named Diff-Font.
The proposed model aims to generate the entire font library by giving only one sample as the reference.
The well-trained Diff-Font is not only robust to font gaps and font variation, but also achieves promising performance on difficult character generation.
arXiv Detail & Related papers (2022-12-12T13:51:50Z) - Few-Shot Font Generation by Learning Fine-Grained Local Styles [90.39288370855115]
Few-shot font generation (FFG) aims to generate a new font with a few examples.
We propose a new font generation approach by learning 1) the fine-grained local styles from references, and 2) the spatial correspondence between the content and reference glyphs.
arXiv Detail & Related papers (2022-05-20T05:07:05Z) - XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font
Generation [13.569449355929574]
We propose a self-supervised cross-modality pre-training strategy and a cross-modality transformer-based encoder.
The encoder is conditioned jointly on the glyph image and the corresponding stroke labels.
It requires only one reference glyph and achieves the lowest rate of bad cases in the few-shot font generation task, 28% lower than the second best.
arXiv Detail & Related papers (2022-04-11T13:34:40Z) - Few-shot Font Generation with Weakly Supervised Localized
Representations [17.97183447033118]
We propose a novel font generation method that learns localized styles, namely component-wise style representations, instead of universal styles.
Our method shows remarkably better few-shot font generation results (with only eight reference glyphs) than other state-of-the-art methods.
arXiv Detail & Related papers (2021-12-22T14:26:53Z) - DAE-GAN: Dynamic Aspect-aware GAN for Text-to-Image Synthesis [55.788772366325105]
We propose a Dynamic Aspect-awarE GAN (DAE-GAN) that represents text information comprehensively from multiple granularities, including sentence-level, word-level, and aspect-level.
Inspired by human learning behaviors, we develop a novel Aspect-aware Dynamic Re-drawer (ADR) for image refinement, in which an Attended Global Refinement (AGR) module and an Aspect-aware Local Refinement (ALR) module are alternately employed.
arXiv Detail & Related papers (2021-08-27T07:20:34Z) - A Multi-Implicit Neural Representation for Fonts [79.6123184198301]
Font-specific discontinuities like edges and corners are difficult to represent using neural networks.
We introduce multi-implicits to represent fonts as a permutation-invariant set of learned implicit functions, without losing features.
arXiv Detail & Related papers (2021-06-12T21:40:11Z) - DG-Font: Deformable Generative Networks for Unsupervised Font Generation [14.178381391124036]
We propose novel deformable generative networks for unsupervised font generation (DGFont).
We introduce a feature deformation skip connection (FDSC) which predicts pairs of displacement maps and employs the predicted maps to apply deformable convolution to the low-level feature maps from the content encoder; a minimal sketch of this idea appears after the list.
Experiments demonstrate that our model generates characters of higher quality than state-of-the-art methods.
arXiv Detail & Related papers (2021-04-07T11:32:32Z) - Few-shot Font Generation with Localized Style Representations and
Factorization [23.781619323447003]
We propose a novel font generation method by learning localized styles, namely component-wise style representations, instead of universal styles.
Our method shows remarkably better few-shot font generation results (with only 8 reference glyph images) than other state-of-the-art methods.
arXiv Detail & Related papers (2020-09-23T10:33:01Z) - UniT: Unified Knowledge Transfer for Any-shot Object Detection and
Segmentation [52.487469544343305]
Methods for object detection and segmentation rely on large-scale instance-level annotations for training.
We propose an intuitive and unified semi-supervised model that is applicable to a range of supervision levels.
arXiv Detail & Related papers (2020-06-12T22:45:47Z)
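
The feature deformation skip connection summarized in the DG-Font entry above lends itself to a short sketch. The following PyTorch code is a simplified, hypothetical reading of that idea, assuming torchvision's deform_conv2d and same-resolution content and guidance features; the module and argument names are illustrative, not the authors' implementation.

    import torch
    import torch.nn as nn
    from torchvision.ops import deform_conv2d

    class FDSCSketch(nn.Module):
        """Hypothetical feature-deformation skip connection: an offset head
        predicts displacement maps, which are used to apply a deformable
        convolution to low-level content-encoder features."""

        def __init__(self, channels: int, kernel_size: int = 3):
            super().__init__()
            self.kernel_size = kernel_size
            # Two (x, y) displacements per kernel sampling location.
            self.offset_head = nn.Conv2d(channels, 2 * kernel_size * kernel_size,
                                         kernel_size=3, padding=1)
            self.weight = nn.Parameter(
                torch.randn(channels, channels, kernel_size, kernel_size) * 0.01)
            self.bias = nn.Parameter(torch.zeros(channels))

        def forward(self, content_feat, guidance_feat):
            # Both feature maps are assumed to share the same spatial size.
            offsets = self.offset_head(guidance_feat)         # displacement maps
            return deform_conv2d(content_feat, offsets, self.weight, self.bias,
                                 padding=self.kernel_size // 2)

    # Usage (shapes are illustrative):
    # fdsc = FDSCSketch(channels=64)
    # out = fdsc(torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32))
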