Moyun: A Diffusion-Based Model for Style-Specific Chinese Calligraphy Generation
- URL: http://arxiv.org/abs/2410.07618v1
- Date: Thu, 10 Oct 2024 05:14:03 GMT
- Title: Moyun: A Diffusion-Based Model for Style-Specific Chinese Calligraphy Generation
- Authors: Kaiyuan Liu, Jiahao Mei, Hengyu Zhang, Yihuai Zhang, Xingjiao Wu, Daoguo Dong, Liang He
- Abstract summary: 'Moyun' can effectively control the generation process and produce calligraphy in the specified style.
Even for characters that the calligrapher never wrote, 'Moyun' can generate calligraphy that matches the calligrapher's style.
- Score: 10.7430517947254
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Although Chinese calligraphy generation has achieved style transfer, generating calligraphy by specifying the calligrapher, font, and character style remains challenging. To address this, we propose a new Chinese calligraphy generation model, 'Moyun', which replaces the U-Net in the diffusion model with Vision Mamba and introduces the TripleLabel control mechanism to achieve controllable calligraphy generation. The model was tested on our large-scale dataset 'Mobao' of over 1.9 million images, and the results demonstrate that 'Moyun' can effectively control the generation process and produce calligraphy in the specified style. Even for characters that the calligrapher never wrote, 'Moyun' can generate calligraphy that matches the calligrapher's style.
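The abstract describes the architecture only at a high level: a diffusion denoiser whose U-Net is replaced by Vision Mamba, conditioned on calligrapher, font, and character-style labels. The sketch below is a rough, hypothetical illustration of how such TripleLabel conditioning could be wired into a denoiser; every name and size is invented, and a generic transformer encoder stands in for the Vision Mamba backbone, which needs a dedicated implementation.

```python
# Hypothetical sketch of TripleLabel-style conditioning in a diffusion
# denoiser. All names and sizes are invented for illustration; a generic
# transformer encoder stands in for the Vision Mamba backbone.
import torch
import torch.nn as nn

class TripleLabelDenoiser(nn.Module):
    def __init__(self, num_calligraphers=100, num_fonts=10, num_styles=5, dim=256):
        super().__init__()
        # One embedding table per condition; their sum conditions every token.
        self.calligrapher_emb = nn.Embedding(num_calligraphers, dim)
        self.font_emb = nn.Embedding(num_fonts, dim)
        self.style_emb = nn.Embedding(num_styles, dim)
        self.time_mlp = nn.Sequential(nn.Linear(1, dim), nn.SiLU(), nn.Linear(dim, dim))
        self.patchify = nn.Conv2d(1, dim, kernel_size=8, stride=8)
        self.backbone = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True),
            num_layers=4)
        self.unpatchify = nn.ConvTranspose2d(dim, 1, kernel_size=8, stride=8)

    def forward(self, x_t, t, calligrapher, font, style):
        # Fuse the timestep and the three style labels into one condition vector.
        cond = (self.calligrapher_emb(calligrapher) + self.font_emb(font)
                + self.style_emb(style) + self.time_mlp(t.float().unsqueeze(-1)))
        tokens = self.patchify(x_t).flatten(2).transpose(1, 2)  # (B, N, dim)
        tokens = self.backbone(tokens + cond.unsqueeze(1))
        b, n, d = tokens.shape
        side = int(n ** 0.5)
        return self.unpatchify(tokens.transpose(1, 2).reshape(b, d, side, side))

# Predict the noise in a batch of 64x64 glyphs for given labels.
model = TripleLabelDenoiser()
eps_hat = model(torch.randn(2, 1, 64, 64), torch.randint(0, 1000, (2,)),
                calligrapher=torch.tensor([3, 7]), font=torch.tensor([1, 2]),
                style=torch.tensor([0, 4]))
```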
Related papers
- Empowering Backbone Models for Visual Text Generation with Input Granularity Control and Glyph-Aware Training [68.41837295318152]
Diffusion-based text-to-image models have demonstrated impressive achievements in diversity and aesthetics but struggle to generate images with visual texts.
Existing backbone models have limitations such as misspelling, an inability to generate visual text, and a lack of support for Chinese text.
We propose a series of methods, aiming to empower backbone models to generate visual texts in English and Chinese.
arXiv Detail & Related papers (2024-10-06T10:25:39Z)
- Style Generation in Robot Calligraphy with Deep Generative Adversarial Networks [15.199472080437527]
Chinese has tens of thousands of characters, which makes it difficult to generate a style-consistent calligraphic font covering more than 6,000 characters.
This paper proposes an automatic calligraphy generation model based on deep generative adversarial networks (deepGAN) that can generate calligraphy fonts in a consistent style at a professional standard.
arXiv Detail & Related papers (2023-12-15T10:35:30Z)
- CalliPaint: Chinese Calligraphy Inpainting with Diffusion Model [17.857394263321538]
We introduce a new model that harnesses recent advancements in both Chinese calligraphy generation and image inpainting.
We demonstrate that our proposed model CalliPaint can produce convincing Chinese calligraphy.
arXiv Detail & Related papers (2023-12-03T23:29:59Z)
- Calliffusion: Chinese Calligraphy Generation and Style Transfer with Diffusion Modeling [1.856334276134661]
We propose Calliffusion, a system for generating high-quality Chinese calligraphy using diffusion models.
Our model architecture is based on DDPM (Denoising Diffusion Probabilistic Models); a generic sketch of the DDPM objective follows this entry.
arXiv Detail & Related papers (2023-05-30T15:34:45Z)
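As background for the DDPM objective this entry refers to, here is a minimal, generic sketch of the standard forward-noising and noise-prediction loss (Ho et al., 2020); it is not Calliffusion's actual code.

```python
# Minimal, generic DDPM training step (Ho et al., 2020), shown only as
# background for the objective Calliffusion builds on.
import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)            # linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative alpha products

def ddpm_loss(model, x0):
    """Sample a timestep, noise x0 in closed form, and regress the noise."""
    t = torch.randint(0, T, (x0.shape[0],))
    eps = torch.randn_like(x0)
    a = alphas_bar[t].view(-1, 1, 1, 1)
    x_t = a.sqrt() * x0 + (1.0 - a).sqrt() * eps  # sample from q(x_t | x_0)
    return F.mse_loss(model(x_t, t), eps)         # epsilon-prediction loss
```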
- Diff-Font: Diffusion Model for Robust One-Shot Font Generation [110.45944936952309]
We propose a novel one-shot font generation method based on a diffusion model, named Diff-Font.
The proposed model aims to generate the entire font library by giving only one sample as the reference.
The well-trained Diff-Font is not only robust to font gaps and font variations but also achieves promising performance on difficult character generation.
arXiv Detail & Related papers (2022-12-12T13:51:50Z)
- ShufaNet: Classification method for calligraphers who have reached the professional level [0.0]
We propose a novel method, ShufaNet, to classify Chinese calligraphers' styles based on metric learning.
Our method achieves a 65% accuracy rate on our dataset for few-shot learning, surpassing ResNet and other mainstream CNNs (a generic metric-learning setup is sketched after this entry).
arXiv Detail & Related papers (2021-11-22T16:55:31Z)
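The abstract says styles are classified with metric learning; one common way to set that up is a triplet loss over learned style embeddings, sketched generically below. This is not ShufaNet's actual architecture, just an illustration of the technique.

```python
# Generic metric-learning setup of the kind ShufaNet's abstract describes:
# learn an embedding where works by the same calligrapher sit close together.
# Purely illustrative; not the paper's architecture.
import torch
import torch.nn as nn

embed = nn.Sequential(                       # toy CNN embedding network
    nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 128))

triplet = nn.TripletMarginLoss(margin=0.2)

anchor, positive, negative = (torch.randn(8, 1, 64, 64) for _ in range(3))
# anchor/positive: same calligrapher; negative: a different one.
loss = triplet(embed(anchor), embed(positive), embed(negative))
```

At test time, a new sample can then be classified by the nearest calligrapher centroid in the embedding space.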
- Learning to Generate Scene Graph from Natural Language Supervision [52.18175340725455]
We propose one of the first methods that learns from image-sentence pairs to extract a graphical representation of localized objects and their relationships within an image, known as a scene graph.
We leverage an off-the-shelf object detector to identify and localize object instances, match the labels of detected regions to concepts parsed from captions, and thus create "pseudo" labels for learning the scene graph (a toy version of this matching is sketched after this entry).
arXiv Detail & Related papers (2021-09-06T03:38:52Z)
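The pseudo-labeling pipeline in this entry (detect regions, then align detector labels with caption concepts) can be illustrated with a toy matching step; real systems use language parsers and synonym matching rather than exact string equality.

```python
# Toy illustration of the pseudo-labeling idea: align detector class names
# with noun concepts parsed from a caption. Plain string matching stands in
# for the parser and synonym sets a real system would use.
detections = [("dog", (10, 20, 80, 90)), ("frisbee", (60, 15, 90, 40))]
caption_concepts = ["dog", "frisbee", "park"]  # nouns parsed from the caption

pseudo_labels = [
    (label, box) for label, box in detections if label in caption_concepts
]
print(pseudo_labels)  # regions whose labels ground a caption concept
```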
- Font Completion and Manipulation by Cycling Between Multi-Modality Representations [113.26243126754704]
We explore the generation of font glyphs as 2D graphic objects, using a graph as an intermediate representation.
We formulate a cross-modality cycled image-to-image structure with a graph constructor between an image encoder and an image renderer.
Our model generates better results than both the image-to-image baseline and previous state-of-the-art methods for glyph completion.
arXiv Detail & Related papers (2021-08-30T02:43:29Z)
- Handwriting Transformers [98.3964093654716]
We propose a transformer-based styled handwritten text image generation approach, HWT, that strives to learn both style-content entanglement and global as well as local writing style patterns.
The proposed HWT captures the long and short range relationships within the style examples through a self-attention mechanism.
Our proposed HWT generates realistic styled handwritten text images and significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-04-08T17:59:43Z)
- Comprehensive Image Captioning via Scene Graph Decomposition [51.660090468384375]
We address the challenging problem of image captioning by revisiting the representation of the image scene graph.
At the core of our method lies the decomposition of a scene graph into a set of sub-graphs.
We design a deep model to select important sub-graphs, and to decode each selected sub-graph into a single target sentence.
arXiv Detail & Related papers (2020-07-23T00:59:21Z)
- CalliGAN: Style and Structure-aware Chinese Calligraphy Character Generator [6.440233787863018]
Chinese calligraphy is the writing of Chinese characters as an art form performed with brushes.
Recent studies show that Chinese characters can be generated through image-to-image translation for multiple styles using a single model.
We extend this approach by incorporating Chinese characters' component information into the model (one way to wire in such conditioning is sketched after this entry).
arXiv Detail & Related papers (2020-05-26T03:15:03Z)
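The summary says component information is incorporated into the model; one plausible, purely illustrative wiring is to encode a sequence of component codes and concatenate it with the content features, as below. This is not CalliGAN's actual code, and all names and sizes are invented.

```python
# Purely illustrative sketch of conditioning a generator on character
# components, in the spirit of CalliGAN's summary; names and sizes invented.
import torch
import torch.nn as nn

class ComponentConditionedGenerator(nn.Module):
    def __init__(self, num_components=500, dim=128):
        super().__init__()
        self.comp_emb = nn.Embedding(num_components, dim)
        self.encode = nn.GRU(dim, dim, batch_first=True)   # component sequence encoder
        self.decode = nn.Sequential(nn.Linear(dim * 2, 64 * 64), nn.Tanh())

    def forward(self, content_feat, component_ids):
        # content_feat: (B, dim) source-glyph features;
        # component_ids: (B, L) indices of the target character's components.
        _, h = self.encode(self.comp_emb(component_ids))
        cond = torch.cat([content_feat, h.squeeze(0)], dim=-1)
        return self.decode(cond).view(-1, 1, 64, 64)       # toy 64x64 glyph

gen = ComponentConditionedGenerator()
img = gen(torch.randn(2, 128), torch.randint(0, 500, (2, 4)))
```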
This list is automatically generated from the titles and abstracts of the papers on this site.