Artistic Intelligence: A Diffusion-Based Framework for High-Fidelity Landscape Painting Synthesis
- URL: http://arxiv.org/abs/2407.17229v4
- Date: Fri, 11 Oct 2024 08:48:03 GMT
- Title: Artistic Intelligence: A Diffusion-Based Framework for High-Fidelity Landscape Painting Synthesis
- Authors: Wanggong Yang, Yifei Zhao
- Abstract summary: LPGen is a novel diffusion-based model specifically designed for landscape painting generation.
LPGen introduces a decoupled cross-attention mechanism that independently processes structural and stylistic features.
The model is pre-trained on a curated dataset of high-resolution landscape images, categorized by distinct artistic styles, and then fine-tuned to ensure detailed and consistent output.
- Score: 2.205829309604458
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating high-fidelity landscape paintings remains a challenging task that requires precise control over both structure and style. In this paper, we present LPGen, a novel diffusion-based model specifically designed for landscape painting generation. LPGen introduces a decoupled cross-attention mechanism that independently processes structural and stylistic features, effectively mimicking the layered approach of traditional painting techniques. Additionally, LPGen incorporates a structural controller, a multi-scale encoder that governs the layout of landscape paintings, striking a balance between aesthetics and composition. The model is pre-trained on a curated dataset of high-resolution landscape images, categorized by distinct artistic styles, and then fine-tuned to ensure detailed and consistent output. Through extensive evaluations, LPGen demonstrates superior performance in producing paintings that are not only structurally accurate but also stylistically coherent, surpassing current state-of-the-art models. This work advances AI-generated art and offers new avenues for exploring the intersection of technology and traditional artistic practices. Our code, dataset, and model weights will be publicly available.
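The decoupled cross-attention mechanism is the paper's central architectural idea. Below is a minimal PyTorch sketch of what such a block might look like, assuming IP-Adapter-style additive merging: the module name, the style_scale weight, and the token shapes are our assumptions, not the paper's published formulation.

```python
import torch
import torch.nn as nn

class DecoupledCrossAttention(nn.Module):
    """Hypothetical sketch: one attention branch reads structural tokens,
    a second reads style tokens, and both results are added back onto the
    denoiser's latent tokens. LPGen's exact projections and weighting may
    differ."""

    def __init__(self, dim: int, num_heads: int = 8, style_scale: float = 1.0):
        super().__init__()
        self.struct_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.style_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.style_scale = style_scale

    def forward(self, x, struct_tokens, style_tokens):
        # x: (B, N, dim) denoiser features acting as queries.
        h_struct, _ = self.struct_attn(x, struct_tokens, struct_tokens)
        h_style, _ = self.style_attn(x, style_tokens, style_tokens)
        return x + h_struct + self.style_scale * h_style
```

Keeping the two branches separate is what lets structure and style be conditioned, and re-weighted, independently, which is the "layered" behavior the abstract describes.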
Related papers
- Training-Free Sketch-Guided Diffusion with Latent Optimization [22.94468603089249]
We propose an innovative training-free pipeline that extends existing text-to-image generation models to incorporate a sketch as an additional condition.
To generate new images with a layout and structure closely resembling the input sketch, we find that these core features of a sketch can be tracked with the cross-attention maps of diffusion models.
We introduce latent optimization, a method that refines the noisy latent at each intermediate step of the generation process.
arXiv Detail & Related papers (2024-08-31T00:44:03Z)
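The latent-optimization step described in this entry can be pictured as a small gradient loop on the noisy latent. In this hedged sketch, `attn_map_fn` is a hypothetical hook that runs the denoiser and extracts a cross-attention map; the MSE loss and the hyperparameters are illustrative assumptions, not the paper's published procedure.

```python
import torch
import torch.nn.functional as F

def refine_latent(latent, sketch_mask, attn_map_fn, lr=0.1, steps=5):
    # Refine the noisy latent at one denoising step so that the model's
    # cross-attention map moves toward the binarized input sketch.
    latent = latent.detach().requires_grad_(True)
    opt = torch.optim.Adam([latent], lr=lr)
    for _ in range(steps):
        attn = attn_map_fn(latent)            # hypothetical: (B, H, W) attention map
        loss = F.mse_loss(attn, sketch_mask)  # sketch_mask resized to (B, H, W)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return latent.detach()
```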
- CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion [74.44273919041912]
Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images.
However, adapting these models for artistic image editing presents two significant challenges.
We build CreativeSynth, an innovative unified framework based on a diffusion model that can coordinate multimodal inputs.
arXiv Detail & Related papers (2024-01-25T10:42:09Z)
- ArtBank: Artistic Style Transfer with Pre-trained Diffusion Model and Implicit Style Prompt Bank [9.99530386586636]
Artistic style transfer aims to repaint the content image with the learned artistic style.
Existing artistic style transfer methods can be divided into two categories: small model-based approaches and pre-trained large-scale model-based approaches.
We propose ArtBank, a novel artistic style transfer framework that generates highly realistic stylized images.
arXiv Detail & Related papers (2023-12-11T05:53:40Z)
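The "implicit style prompt bank" in the title suggests a set of learnable prompt embeddings, one block per style, trained while the diffusion backbone stays frozen. A speculative sketch (the sizes and names are assumptions):

```python
import torch
import torch.nn as nn

class StylePromptBank(nn.Module):
    """Assumed structure: learnable prompt tokens per style; a frozen
    diffusion model is conditioned on these instead of hand-written prompts."""

    def __init__(self, num_styles: int, tokens_per_style: int = 8, dim: int = 768):
        super().__init__()
        self.bank = nn.Parameter(torch.randn(num_styles, tokens_per_style, dim) * 0.02)

    def forward(self, style_id: torch.Tensor) -> torch.Tensor:
        # Returns the prompt tokens for the requested style(s), to be fed to
        # the frozen denoiser alongside or instead of text-encoder output.
        return self.bank[style_id]
```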
- Collaborative Neural Painting [27.880814775833578]
We introduce a novel task, Collaborative Neural Painting (CNP), to facilitate collaborative art painting generation between humans and machines.
CNP should produce a sequence of strokes supporting the completion of a coherent painting.
We propose a painting representation based on a sequence of parametrized strokes, which facilitates both editing and composition operations.
arXiv Detail & Related papers (2023-12-04T10:45:12Z)
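A stroke-sequence representation of the kind this entry describes could look like the following; the specific stroke parameters are an assumption, chosen to show why editing and composition reduce to plain list and field operations.

```python
from dataclasses import dataclass, replace
from typing import List

@dataclass
class Stroke:
    x0: float  # start point, normalized canvas coordinates
    y0: float
    x1: float  # end point
    y1: float
    width: float
    r: float
    g: float
    b: float

Painting = List[Stroke]  # a painting is an ordered stroke sequence

def recolor(painting: Painting, r: float, g: float, b: float) -> Painting:
    # Editing is a transform over stroke parameters ...
    return [replace(s, r=r, g=g, b=b) for s in painting]

def compose(a: Painting, b: Painting) -> Painting:
    # ... and composition is concatenation in paint order.
    return a + b
```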
- Image Inpainting via Tractable Steering of Diffusion Models [54.13818673257381]
This paper proposes to exploit the ability of Tractable Probabilistic Models (TPMs) to exactly and efficiently compute the constrained posterior.
Specifically, this paper adopts a class of expressive TPMs termed Probabilistic Circuits (PCs).
We show that our approach can consistently improve the overall quality and semantic coherence of inpainted images with only 10% additional computational overhead.
arXiv Detail & Related papers (2023-11-28T21:14:02Z)
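As a toy illustration of why probabilistic circuits support the exact, efficient inference this entry relies on: a marginal is computed in a single bottom-up pass by setting unobserved leaves to 1. This two-variable example is ours; the paper's circuits operate over image pixels at scale.

```python
from dataclasses import dataclass

@dataclass
class Leaf:
    var: int    # variable index
    p: float    # P(X_var = 1)

    def value(self, evidence: dict) -> float:
        x = evidence.get(self.var)  # None means "marginalize this variable"
        if x is None:
            return 1.0              # p + (1 - p): sum over both states
        return self.p if x == 1 else 1.0 - self.p

@dataclass
class Product:
    children: list
    def value(self, evidence: dict) -> float:
        out = 1.0
        for c in self.children:
            out *= c.value(evidence)
        return out

@dataclass
class Sum:
    weights: list
    children: list
    def value(self, evidence: dict) -> float:
        return sum(w * c.value(evidence) for w, c in zip(self.weights, self.children))

# P(X0 = 1), with X1 marginalized exactly in one pass:
pc = Sum([0.6, 0.4], [Product([Leaf(0, 0.9), Leaf(1, 0.2)]),
                      Product([Leaf(0, 0.1), Leaf(1, 0.7)])])
print(pc.value({0: 1}))  # 0.6 * 0.9 + 0.4 * 0.1 = 0.58
```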
- Advancing Urban Renewal: An Automated Approach to Generating Historical Arcade Facades with Stable Diffusion Models [1.645684081891833]
This study introduces a new methodology for automatically generating images of historical arcade facades.
By classifying and tagging a variety of arcade styles, we have constructed several realistic arcade facade image datasets.
Our approach has demonstrated high levels of precision, authenticity, and diversity in the generated images.
arXiv Detail & Related papers (2023-11-20T08:03:12Z)
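Inference with style tags of the kind this entry describes might look like the following with the Hugging Face diffusers library. The checkpoint id and the prompt tags are placeholders; the study's own fine-tuning on its tagged arcade datasets is not shown here.

```python
# pip install diffusers transformers accelerate torch
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # placeholder base model
    torch_dtype=torch.float16,
).to("cuda")

# Hypothetical style tags standing in for the paper's arcade-style labels.
prompt = "historical arcade facade, colonnade, weathered plaster, street view photo"
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("arcade_facade.png")
```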
- Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion [50.59261592343479]
We present Kandinsky, a novel exploration of latent diffusion architecture.
An image prior is trained separately to map text embeddings to CLIP image embeddings.
We also deployed a user-friendly demo system that supports diverse generative modes such as text-to-image generation, image fusion, text and image fusion, image variations generation, and text-guided inpainting/outpainting.
arXiv Detail & Related papers (2023-10-05T12:29:41Z)
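The separately trained prior maps CLIP text embeddings to CLIP image embeddings, which the latent-diffusion decoder then consumes. The MLP below is a deliberately simplified stand-in that only illustrates this interface; Kandinsky's actual prior is a trained mapping of its own, not this regression head.

```python
import torch
import torch.nn as nn

class TextToImagePrior(nn.Module):
    # Schematic: CLIP text embedding in, CLIP-image-like embedding out.
    def __init__(self, dim: int = 768):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 4 * dim),
            nn.GELU(),
            nn.Linear(4 * dim, dim),
        )

    def forward(self, text_emb: torch.Tensor) -> torch.Tensor:
        img_emb = self.net(text_emb)
        # CLIP embeddings are compared by cosine similarity, so normalize.
        return img_emb / img_emb.norm(dim=-1, keepdim=True)
```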
- CCLAP: Controllable Chinese Landscape Painting Generation via Latent Diffusion Model [54.74470985388726]
We propose a controllable Chinese landscape painting generation method named CCLAP.
Our method achieves state-of-the-art performance, especially in artful composition and artistic conception.
arXiv Detail & Related papers (2023-04-09T04:16:28Z)
- QuantArt: Quantizing Image Style Transfer Towards High Visual Fidelity [94.5479418998225]
We propose a new style transfer framework called QuantArt for high visual-fidelity stylization.
Our framework achieves significantly higher visual fidelity than existing style transfer methods.
arXiv Detail & Related papers (2022-12-20T17:09:53Z)
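The "quantizing" in QuantArt refers to snapping continuous features onto a learned codebook. A generic vector-quantization sketch, not QuantArt's specific architecture or losses:

```python
import torch

def quantize(features: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    # features: (N, D) continuous vectors; codebook: (K, D) learned codes.
    d = torch.cdist(features, codebook)  # (N, K) pairwise Euclidean distances
    idx = d.argmin(dim=1)                # nearest code per feature vector
    return codebook[idx]                 # features snapped onto the codebook
```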
- Compositional Transformers for Scene Generation [13.633811200719627]
We introduce the GANformer2 model, an iterative object-oriented transformer, explored for the task of generative modeling.
We show it achieves state-of-the-art performance in terms of visual quality, diversity and consistency.
Further experiments demonstrate the model's disentanglement and provide a deeper insight into its generative process.
arXiv Detail & Related papers (2021-11-17T08:11:42Z)
- Modeling Artistic Workflows for Image Generation and Editing [83.43047077223947]
We propose a generative model that follows a given artistic workflow.
It enables both multi-stage image generation and multi-stage image editing of an existing piece of art.
arXiv Detail & Related papers (2020-07-14T17:54:26Z)