Skeleton and Font Generation Network for Zero-shot Chinese Character Generation
- URL: http://arxiv.org/abs/2501.08062v1
- Date: Tue, 14 Jan 2025 12:15:49 GMT
- Title: Skeleton and Font Generation Network for Zero-shot Chinese Character Generation
- Authors: Mobai Xue, Jun Du, Zhenrong Zhang, Jiefeng Ma, Qikai Chang, Pengfei Hu, Jianshu Zhang, Yu Hu
- Abstract summary: We propose a novel Skeleton and Font Generation Network (SFGN) to achieve a more robust Chinese character font generation.
We conduct experiments on misspelled characters, a substantial portion of which differ only slightly from common ones.
The generated images visually demonstrate the efficacy of our approach, which outperforms current state-of-the-art font generation methods.
- Score: 53.08596064763731
- License:
- Abstract: Automatic font generation remains a challenging research issue, primarily due to the vast number of Chinese characters, each with a unique and intricate structure. Our investigation of previous studies reveals an inherent bias capable of causing structural changes in characters. Specifically, when generating a Chinese character similar to, but different from, those in the training samples, the bias is prone to either correcting or ignoring these subtle variations. To address this concern, we propose a novel Skeleton and Font Generation Network (SFGN) to achieve more robust Chinese character font generation. Our approach comprises a skeleton builder and a font generator. The skeleton builder synthesizes content features from low-resource text input, enabling our technique to perform font generation independently of content image inputs. Unlike previous font generation methods that treat font style as a global embedding, we introduce a font generator that aligns content and style features at the radical level, a brand-new perspective for font generation. Beyond common characters, we also conduct experiments on misspelled characters, a substantial portion of which differ only slightly from common ones. The generated images visually demonstrate the efficacy of our approach, which outperforms current state-of-the-art font generation methods. Moreover, we believe that misspelled character generation has significant pedagogical implications and verify this supposition through experiments: we use generated misspelled characters as data augmentation in Chinese character error correction tasks, simulating the scenario where students learn handwritten Chinese characters with the help of misspelled characters. The significantly improved performance on error correction tasks demonstrates the effectiveness of our proposed approach and the value of misspelled character generation.
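The core idea in the abstract, building per-radical content features from text alone and then aligning a style feature with each radical individually instead of using one global style embedding, can be illustrated with a minimal sketch. Everything here (function names, feature dimensions, the toy feature mapping) is a hypothetical stand-in for illustration, not the paper's actual architecture.

```python
# Hypothetical sketch of the SFGN idea: a skeleton builder turns a text-level
# character description (its radicals) into per-radical content features, and
# a font generator modulates each radical's content with a radical-specific
# style vector. Real SFGN uses learned neural modules; this is only a shape
# illustration under made-up assumptions.

def skeleton_builder(radicals, dim=4):
    """Map each radical ID to a deterministic toy content feature vector."""
    return {r: [float((sum(map(ord, r)) + k) % 7) for k in range(dim)]
            for r in radicals}

def font_generator(content_feats, radical_styles):
    """Align content and style per radical via elementwise modulation."""
    glyph_feats = {}
    for radical, content in content_feats.items():
        style = radical_styles.get(radical, [1.0] * len(content))
        glyph_feats[radical] = [c * s for c, s in zip(content, style)]
    return glyph_feats

# A character described only by text (its radicals) -- no content image needed.
radicals = ["woman", "child"]          # stand-ins for real radical IDs
content = skeleton_builder(radicals)
styles = {"woman": [0.5, 1.0, 2.0, 1.0], "child": [1.0, 1.0, 1.0, 0.5]}
glyph = font_generator(content, styles)
```

The point of the per-radical dictionary, rather than a single style vector applied to the whole character, mirrors the paper's claimed contrast with global style embeddings.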
Related papers
- Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning [61.34060587461462]
We propose a two-stage framework for Chinese Text Recognition (CTR)
We pre-train a CLIP-like model through aligning printed character images and Ideographic Description Sequences (IDS)
This pre-training stage simulates humans recognizing Chinese characters and obtains the canonical representation of each character.
The learned representations are employed to supervise the CTR model, such that traditional single-character recognition can be improved to text-line recognition.
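The Ideographic Description Sequences (IDS) mentioned above encode a character's layout as a prefix expression over layout operators, e.g. 好 decomposes left-right into 女 and 子 as "⿰女子". A toy parser makes the structure concrete; it handles only the two binary operators (left-right U+2FF0 and top-bottom U+2FF1) as a simplifying assumption, while real IDS defines more operators, including ternary ones.

```python
# Toy IDS parser: an IDS string is a prefix expression, so a recursive
# descent over it yields a nested (operator, left, right) tree. Only the
# binary operators are handled in this sketch.

BINARY_OPS = {"\u2ff0", "\u2ff1"}  # left-right, top-bottom

def parse_ids(seq):
    """Parse an IDS string into a nested (op, left, right) tree."""
    def parse(i):
        ch = seq[i]
        if ch in BINARY_OPS:
            left, i = parse(i + 1)
            right, i = parse(i)
            return (ch, left, right), i
        return ch, i + 1              # leaf component
    tree, end = parse(0)
    assert end == len(seq), "trailing characters in IDS"
    return tree

def leaves(tree):
    """Collect the leaf components of an IDS tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    _, left, right = tree
    return leaves(left) + leaves(right)

tree = parse_ids("\u2ff0\u5973\u5b50")  # IDS of 好: left-right 女 + 子
```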
arXiv Detail & Related papers (2023-09-03T05:33:16Z)
- VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and Quantization [52.870638830417]
We propose a VQGAN-based framework (i.e., VQ-Font) to enhance glyph fidelity through token prior refinement and structure-aware enhancement.
Specifically, we pre-train a VQGAN to encapsulate font token prior within a codebook. Subsequently, VQ-Font refines the synthesized glyphs with the codebook to eliminate the domain gap between synthesized and real-world strokes.
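The "font token prior within a codebook" rests on standard vector quantization: a continuous glyph feature is replaced by its nearest codebook entry. A minimal sketch of that lookup, with made-up codebook values purely for illustration:

```python
# Nearest-neighbor codebook lookup, the basic operation behind a VQGAN's
# token prior. The codebook here is illustrative, not learned.

def quantize(vec, codebook):
    """Return (index, entry) of the nearest codebook vector under L2."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    idx = min(range(len(codebook)), key=lambda i: dist2(vec, codebook[i]))
    return idx, codebook[idx]

codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
idx, token = quantize([0.9, 0.2], codebook)   # snaps to the nearest entry
```

Refining a synthesized glyph with such a lookup snaps noisy features onto the discrete prior learned from real strokes, which is how the domain gap is narrowed.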
arXiv Detail & Related papers (2023-08-27T06:32:20Z)
- Learning Generative Structure Prior for Blind Text Image Super-resolution [153.05759524358467]
We present a novel prior that focuses more on the character structure.
To restrict the generative space of StyleGAN, we store the discrete features for each character in a codebook.
The proposed structure prior exerts stronger character-specific guidance to restore faithful and precise strokes of a designated character.
arXiv Detail & Related papers (2023-03-26T13:54:28Z)
- Diff-Font: Diffusion Model for Robust One-Shot Font Generation [110.45944936952309]
We propose a novel one-shot font generation method based on a diffusion model, named Diff-Font.
The proposed model aims to generate the entire font library by giving only one sample as the reference.
The well-trained Diff-Font is not only robust to font gap and font variation, but also achieves promising performance on difficult character generation.
arXiv Detail & Related papers (2022-12-12T13:51:50Z)
- SVG Vector Font Generation for Chinese Characters with Transformer [42.46279506573065]
We propose a novel network architecture with Transformer and loss functions to capture structural features without differentiable rendering.
Although the dataset was still limited to the sans-serif family, we successfully generated Chinese vector fonts for the first time.
arXiv Detail & Related papers (2022-06-21T12:51:19Z)
- Scalable Font Reconstruction with Dual Latent Manifolds [55.29525824849242]
We propose a deep generative model that performs typography analysis and font reconstruction.
Our approach enables us to massively scale up the number of character types we can effectively model.
We evaluate on the task of font reconstruction over various datasets representing character types of many languages.
arXiv Detail & Related papers (2021-09-10T20:37:43Z)
- ZiGAN: Fine-grained Chinese Calligraphy Font Generation via a Few-shot Style Transfer Approach [7.318027179922774]
ZiGAN is a powerful end-to-end Chinese calligraphy font generation framework.
It does not require any manual operation or redundant preprocessing to generate fine-grained target-style characters.
Our method has a state-of-the-art generalization ability in few-shot Chinese character style transfer.
arXiv Detail & Related papers (2021-08-08T09:50:20Z)
- StrokeGAN: Reducing Mode Collapse in Chinese Font Generation via Stroke Encoding [20.877391644999534]
We introduce a one-bit stroke encoding to capture the key mode information of Chinese characters.
We incorporate this mode information into CycleGAN, a popular deep generative model for Chinese font generation.
StrokeGAN generally outperforms state-of-the-art methods in terms of content and recognition accuracy.
arXiv Detail & Related papers (2020-12-16T01:36:19Z)
- Few-Shot Font Generation with Deep Metric Learning [33.12829580813688]
The proposed framework introduces deep metric learning to style encoders.
We performed experiments using black-and-white and shape-distinctive font datasets.
arXiv Detail & Related papers (2020-11-04T10:12:10Z)
- CalliGAN: Style and Structure-aware Chinese Calligraphy Character Generator [6.440233787863018]
Chinese calligraphy is the writing of Chinese characters as an art form performed with brushes.
Recent studies show that Chinese characters can be generated through image-to-image translation for multiple styles using a single model.
We propose a novel method of this approach by incorporating Chinese characters' component information into its model.
arXiv Detail & Related papers (2020-05-26T03:15:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.