Neural-Polyptych: Content Controllable Painting Recreation for Diverse Genres
- URL: http://arxiv.org/abs/2409.19690v1
- Date: Sun, 29 Sep 2024 12:46:00 GMT
- Title: Neural-Polyptych: Content Controllable Painting Recreation for Diverse Genres
- Authors: Yiming Zhao, Dewen Guo, Zhouhui Lian, Yue Gao, Jianhong Han, Jie Feng, Guoping Wang, Bingfeng Zhou, Sheng Li
- Abstract summary: We present a unified framework, Neural-Polyptych, to facilitate the creation of expansive, high-resolution paintings.
We have designed a multi-scale GAN-based architecture to decompose the generation process into two parts.
We validate our approach on diverse genres of both Eastern and Western paintings.
- Score: 30.83874057768352
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To bridge the gap between artists and non-specialists, we present a unified framework, Neural-Polyptych, to facilitate the creation of expansive, high-resolution paintings by seamlessly incorporating interactive hand-drawn sketches with fragments from original paintings. We have designed a multi-scale GAN-based architecture to decompose the generation process into two parts, responsible for identifying global and local features, respectively. To enhance the fidelity of semantic details generated from users' sketched outlines, we introduce a Correspondence Attention module utilizing our Reference Bank strategy. This ensures the creation of high-quality, intricately detailed elements within the artwork. The final result is achieved by carefully blending these local elements while preserving coherent global consistency. Consequently, this methodology enables the production of digital paintings at megapixel scale, accommodating diverse artistic expressions and enabling users to recreate content in a controlled manner. We validate our approach on diverse genres of both Eastern and Western paintings. Applications such as large painting extension, texture shuffling, genre switching, mural art restoration, and recomposition can all be built on our framework.
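The abstract describes the Correspondence Attention module only at a high level. As a rough illustration, it can be pictured as cross-attention from sketch-region features to a bank of reference-patch features taken from the source painting, so that generated details are borrowed from real content. Everything below (the class name `CorrespondenceAttention`, the shapes, the single attention head) is an illustrative assumption, not the authors' implementation.

```python
# Minimal sketch of a correspondence-attention step (hypothetical, not the
# authors' code): tokens from a user-sketched region attend over a bank of
# reference-patch tokens cropped from the original painting.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CorrespondenceAttention(nn.Module):
    def __init__(self, dim: int = 256):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)  # queries from sketch features
        self.to_k = nn.Linear(dim, dim)  # keys from reference-bank features
        self.to_v = nn.Linear(dim, dim)  # values from reference-bank features
        self.scale = dim ** -0.5

    def forward(self, sketch_feat: torch.Tensor, bank_feat: torch.Tensor) -> torch.Tensor:
        # sketch_feat: (B, N, dim) tokens of the sketched region
        # bank_feat:   (B, M, dim) tokens of the reference patches
        q, k, v = self.to_q(sketch_feat), self.to_k(bank_feat), self.to_v(bank_feat)
        attn = F.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        # Each sketch token becomes a soft mixture of reference details;
        # the residual keeps the layout implied by the sketch.
        return sketch_feat + attn @ v

# Usage: 64 sketch tokens query a bank of 512 reference-patch tokens.
module = CorrespondenceAttention(dim=256)
out = module(torch.randn(1, 64, 256), torch.randn(1, 512, 256))
print(out.shape)  # torch.Size([1, 64, 256])
```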
Related papers
- VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model [76.02314305164595]
This work presents a novel image outpainting framework that can customize the results according to users' requirements.
We take advantage of a Multimodal Large Language Model (MLLM) that automatically extracts and organizes the corresponding textual descriptions of the masked and unmasked parts of a given image.
In addition, a special Cross-Attention module, namely Center-Total-Surrounding (CTS), is designed to further enhance the interaction between specific spatial regions of the image and the corresponding parts of the text prompts.
arXiv Detail & Related papers (2024-06-03T07:14:19Z)
- CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion [74.44273919041912]
Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images.
However, adapting these models for artistic image editing presents two significant challenges.
We build the unified framework CreativeSynth, which is based on a diffusion model with the ability to coordinate multimodal inputs.
arXiv Detail & Related papers (2024-01-25T10:42:09Z)
- Segmentation-Based Parametric Painting [22.967620358813214]
We introduce a novel image-to-painting method that facilitates the creation of large-scale, high-fidelity paintings with human-like quality and stylistic variation.
We introduce a segmentation-based painting process and a dynamic attention map approach inspired by human painting strategies.
Our optimized batch processing and patch-based loss framework enable efficient handling of large canvases.
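The patch-based loss is not specified in this summary; one plausible minimal version (an assumption for illustration, not the paper's formulation) compares randomly sampled aligned crops of the canvas and the target, so memory use per step stays bounded regardless of canvas size.

```python
# Hypothetical patch-based loss for large canvases (illustrative only):
# compare randomly sampled aligned crops instead of the full images, so
# memory cost per step is fixed no matter how large the canvas grows.
import torch
import torch.nn.functional as F

def patch_loss(canvas: torch.Tensor, target: torch.Tensor,
               patch: int = 128, n_patches: int = 16) -> torch.Tensor:
    # canvas, target: (B, C, H, W) tensors of identical shape
    _, _, h, w = canvas.shape
    losses = []
    for _ in range(n_patches):
        top = torch.randint(0, h - patch + 1, (1,)).item()
        left = torch.randint(0, w - patch + 1, (1,)).item()
        crop_c = canvas[:, :, top:top + patch, left:left + patch]
        crop_t = target[:, :, top:top + patch, left:left + patch]
        losses.append(F.l1_loss(crop_c, crop_t))
    return torch.stack(losses).mean()

# Usage: on a 2048x2048 canvas only 16 crops of 128x128 are compared.
loss = patch_loss(torch.rand(1, 3, 2048, 2048), torch.rand(1, 3, 2048, 2048))
print(float(loss))
```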
arXiv Detail & Related papers (2023-11-24T04:15:10Z)
- Composite Diffusion | whole >= Σ parts [0.0]
This paper introduces Composite Diffusion as a means for artists to generate high-quality images by composing from the sub-scenes.
We provide a comprehensive and modular method for Composite Diffusion that enables alternative ways of generating, composing, and harmonizing sub-scenes.
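"Composing from sub-scenes" can be pictured with a standard masked-composition step; the sketch below is a generic illustration under that assumption, not the paper's specific method. Each sub-scene is generated separately and pasted into the shared canvas through its region mask, after which a harmonization pass would blend the seams.

```python
# Generic masked composition of independently generated sub-scenes
# (an illustrative sketch, not the Composite Diffusion algorithm itself).
import torch

def compose(sub_scenes: list, masks: list) -> torch.Tensor:
    # sub_scenes: list of (C, H, W) images; masks: list of (1, H, W) in [0, 1],
    # assumed to partition the canvas (they sum to 1 at every pixel).
    canvas = torch.zeros_like(sub_scenes[0])
    for img, mask in zip(sub_scenes, masks):
        canvas += img * mask
    return canvas  # a harmonization pass would then blend the seams

# Usage: compose left/right halves produced by two separate generations.
h = w = 256
left = torch.zeros(1, h, w)
left[..., : w // 2] = 1.0
out = compose([torch.rand(3, h, w), torch.rand(3, h, w)], [left, 1.0 - left])
print(out.shape)  # torch.Size([3, 256, 256])
```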
arXiv Detail & Related papers (2023-07-25T17:58:43Z)
- CCLAP: Controllable Chinese Landscape Painting Generation via Latent Diffusion Model [54.74470985388726]
We propose a controllable Chinese landscape painting generation method named CCLAP.
Our method achieves state-of-the-art performance, especially in artful composition and artistic conception.
arXiv Detail & Related papers (2023-04-09T04:16:28Z)
- QuantArt: Quantizing Image Style Transfer Towards High Visual Fidelity [94.5479418998225]
We propose a new style transfer framework called QuantArt for high visual-fidelity stylization.
Our framework achieves significantly higher visual fidelity compared with the existing style transfer methods.
arXiv Detail & Related papers (2022-12-20T17:09:53Z)
- Adaptively-Realistic Image Generation from Stroke and Sketch with Diffusion Model [31.652827838300915]
We propose a diffusion-based unified framework supporting three-dimensional control over image synthesis from sketches and strokes.
Our framework achieves state-of-the-art performance while providing flexibility in generating customized images with control over shape, color, and realism.
Our method unleashes applications such as editing on real images, generation with partial sketches and strokes, and multi-domain multi-modal synthesis.
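The summary names control over shape, color, and realism but not the mechanism. A common way to realize per-condition control in diffusion models is classifier-free guidance with a separate scale per condition; the sketch below is a generic version of that idea, not necessarily the paper's exact rule, and `model` is a stand-in denoiser.

```python
# Generic two-condition classifier-free guidance (an assumed sketch of how
# separate shape/color scales might work; not necessarily the paper's rule).
import torch

def guided_eps(model, x_t, t, sketch, stroke,
               s_sketch: float = 3.0, s_stroke: float = 3.0) -> torch.Tensor:
    # model(x_t, t, sketch, stroke) predicts noise; passing None drops a condition.
    eps_uncond = model(x_t, t, None, None)
    eps_sketch = model(x_t, t, sketch, None)
    eps_stroke = model(x_t, t, None, stroke)
    # Each scale independently strengthens adherence to one control signal;
    # lowering both moves samples toward the unconditional, more photo-real mode.
    return (eps_uncond
            + s_sketch * (eps_sketch - eps_uncond)
            + s_stroke * (eps_stroke - eps_uncond))

# Usage with a stand-in denoiser that just returns noise of the right shape.
dummy = lambda x, t, sk, st: torch.randn_like(x)
x = torch.randn(1, 3, 64, 64)
print(guided_eps(dummy, x, 10, sketch=x, stroke=x).shape)
```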
arXiv Detail & Related papers (2022-08-26T13:59:26Z)
- Arbitrary Style Transfer with Structure Enhancement by Combining the Global and Local Loss [51.309905690367835]
We introduce a novel arbitrary style transfer method with structure enhancement by combining the global and local loss.
Experimental results demonstrate that our method can generate higher-quality images with impressive visual effects.
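The summary does not define the combined loss; a minimal generic version (my assumption, not the paper's exact terms) adds a whole-image Gram-matrix style loss to patch-wise Gram losses computed on quadrants, so local structure is penalized alongside global statistics.

```python
# Assumed illustration of a global style loss combined with local (patch)
# style losses; the paper's actual loss terms may differ.
import torch
import torch.nn.functional as F

def gram(f: torch.Tensor) -> torch.Tensor:
    # f: (B, C, H, W) features -> (B, C, C) normalized Gram matrix
    b, c, h, w = f.shape
    f = f.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def global_local_style_loss(out_feat: torch.Tensor, style_feat: torch.Tensor,
                            local_weight: float = 1.0) -> torch.Tensor:
    # Global term: match feature statistics over the entire image.
    loss = F.mse_loss(gram(out_feat), gram(style_feat))
    b, c, h, w = out_feat.shape
    # Local term: match statistics quadrant by quadrant, penalizing
    # structure-breaking artifacts that the global average washes out.
    for i in (0, h // 2):
        for j in (0, w // 2):
            loss = loss + local_weight * F.mse_loss(
                gram(out_feat[:, :, i:i + h // 2, j:j + w // 2]),
                gram(style_feat[:, :, i:i + h // 2, j:j + w // 2]))
    return loss

# Usage on encoder features (e.g. from a fixed VGG layer).
f_out, f_sty = torch.randn(1, 64, 16, 16), torch.randn(1, 64, 16, 16)
print(float(global_local_style_loss(f_out, f_sty)))
```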
arXiv Detail & Related papers (2022-07-23T07:02:57Z)
- In&Out: Diverse Image Outpainting via GAN Inversion [89.84841983778672]
Image outpainting seeks a semantically consistent extension of the input image beyond its available content.
In this work, we formulate the problem from the perspective of inverting generative adversarial networks.
Our generator renders micro-patches conditioned on their joint latent code as well as their individual positions in the image.
arXiv Detail & Related papers (2021-04-01T17:59:10Z)
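The micro-patch idea, one generator call per patch conditioned on a shared latent plus the patch's coordinates, can be sketched as follows. This is a toy illustration; the class name `MicroPatchGenerator`, the MLP layers, and all sizes are assumptions rather than the paper's architecture.

```python
# Toy position-conditioned micro-patch generator in the spirit of the
# described approach (all layer choices here are illustrative assumptions).
import torch
import torch.nn as nn

class MicroPatchGenerator(nn.Module):
    def __init__(self, z_dim: int = 64, patch: int = 8):
        super().__init__()
        self.patch = patch
        # Input: shared latent code concatenated with a 2-D patch position.
        self.net = nn.Sequential(
            nn.Linear(z_dim + 2, 256), nn.ReLU(),
            nn.Linear(256, 3 * patch * patch), nn.Tanh(),
        )

    def forward(self, z: torch.Tensor, pos: torch.Tensor) -> torch.Tensor:
        # z: (B, z_dim) joint latent; pos: (B, 2) normalized (x, y) in [-1, 1]
        out = self.net(torch.cat([z, pos], dim=1))
        return out.view(-1, 3, self.patch, self.patch)

# Outpainting then amounts to requesting patch positions beyond the original
# image boundary while keeping the joint latent code fixed.
gen = MicroPatchGenerator()
patch = gen(torch.randn(4, 64), torch.rand(4, 2) * 2 - 1)
print(patch.shape)  # torch.Size([4, 3, 8, 8])
```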
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.