Artistic Intelligence: A Diffusion-Based Framework for High-Fidelity Landscape Painting Synthesis
- URL: http://arxiv.org/abs/2407.17229v4
- Date: Fri, 11 Oct 2024 08:48:03 GMT
- Title: Artistic Intelligence: A Diffusion-Based Framework for High-Fidelity Landscape Painting Synthesis
- Authors: Wanggong Yang, Yifei Zhao
- Abstract summary: LPGen is a novel diffusion-based model specifically designed for landscape painting generation.
LPGen introduces a decoupled cross-attention mechanism that independently processes structural and stylistic features.
The model is pre-trained on a curated dataset of high-resolution landscape images, categorized by distinct artistic styles, and then fine-tuned to ensure detailed and consistent output.
- Score: 2.205829309604458
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating high-fidelity landscape paintings remains a challenging task that requires precise control over both structure and style. In this paper, we present LPGen, a novel diffusion-based model specifically designed for landscape painting generation. LPGen introduces a decoupled cross-attention mechanism that independently processes structural and stylistic features, effectively mimicking the layered approach of traditional painting techniques. LPGen also introduces a structural controller, a multi-scale encoder that controls the layout of landscape paintings, striking a balance between aesthetics and composition. In addition, the model is pre-trained on a curated dataset of high-resolution landscape images, categorized by distinct artistic styles, and then fine-tuned to ensure detailed and consistent output. Through extensive evaluations, LPGen demonstrates superior performance in producing paintings that are not only structurally accurate but also stylistically coherent, surpassing current state-of-the-art models. This work advances AI-generated art and offers new avenues for exploring the intersection of technology and traditional artistic practices. Our code, dataset, and model weights will be publicly available.
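The core idea of decoupled cross-attention is that structural and stylistic conditioning tokens each get their own attention pass over the latent, and the results are combined afterward. The following is a minimal NumPy sketch of that idea only; the tensor shapes, weighting scheme, and function names are illustrative assumptions, not the LPGen architecture itself.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Standard scaled dot-product attention.
    scale = 1.0 / np.sqrt(q.shape[-1])
    return softmax((q @ k.T) * scale) @ v

def decoupled_cross_attention(latent_q, struct_feats, style_feats,
                              w_struct=1.0, w_style=1.0):
    # Two independent cross-attention passes: one over structural
    # (layout) tokens, one over style tokens. Because the passes are
    # decoupled, each conditioning signal can be weighted separately.
    out_struct = attention(latent_q, struct_feats, struct_feats)
    out_style = attention(latent_q, style_feats, style_feats)
    return w_struct * out_struct + w_style * out_style

rng = np.random.default_rng(0)
q = rng.normal(size=(16, 64))      # latent image tokens (hypothetical shape)
struct = rng.normal(size=(8, 64))  # structural condition tokens
style = rng.normal(size=(4, 64))   # style reference tokens
out = decoupled_cross_attention(q, struct, style)
print(out.shape)  # (16, 64)
```

Setting `w_style=0.0` leaves only the structural pass, which is the practical benefit of decoupling: either signal can be attenuated without retraining the other.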
Related papers
- Personalized Image Generation with Deep Generative Models: A Decade Survey [51.26287478042516]
We present a review of generalized personalized image generation across various generative models.
We first define a unified framework that standardizes the personalization process across different generative models.
We then provide an in-depth analysis of personalization techniques within each generative model, highlighting their unique contributions and innovations.
arXiv Detail & Related papers (2025-02-18T17:34:04Z) - LineArt: A Knowledge-guided Training-free High-quality Appearance Transfer for Design Drawing with Diffusion Model [8.938617090786494]
We present LineArt, a framework that transfers complex appearance onto detailed design drawings.
It generates high-fidelity appearance while preserving structural accuracy by simulating hierarchical visual cognition.
It requires no precise 3D modeling, physical property specs, or network training, making it more convenient for design tasks.
arXiv Detail & Related papers (2024-12-16T07:54:45Z) - A Tiered GAN Approach for Monet-Style Image Generation [0.562479170374811]
This paper introduces a tiered GAN model to progressively refine image quality through a multi-stage process.
The model transforms random noise into detailed artistic representations, addressing common challenges such as instability in training, mode collapse, and output quality.
arXiv Detail & Related papers (2024-12-07T19:10:29Z) - Diffusion Models with Anisotropic Gaussian Splatting for Image Inpainting [0.0]
We propose a novel inpainting method that combines diffusion models with anisotropic Gaussian splatting to capture both local structures and global context effectively.
Our method outperforms state-of-the-art techniques, producing visually plausible results with enhanced structural integrity and texture realism.
arXiv Detail & Related papers (2024-12-02T16:29:06Z) - Training-Free Sketch-Guided Diffusion with Latent Optimization [22.94468603089249]
We propose an innovative training-free pipeline that extends existing text-to-image generation models to incorporate a sketch as an additional condition.
To generate new images with a layout and structure closely resembling the input sketch, we find that these core features of a sketch can be tracked with the cross-attention maps of diffusion models.
We introduce latent optimization, a method that refines the noisy latent at each intermediate step of the generation process.
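Latent optimization, as summarized above, refines the noisy latent at an intermediate denoising step so that an attention-derived layout signal moves toward the target implied by the input sketch. The toy sketch below illustrates only the refinement loop, using a fixed linear map as a stand-in for the cross-attention-map extraction; the projection, learning rate, and loss are assumptions for illustration, not the paper's actual objective.

```python
import numpy as np

def latent_optimization_step(latent, proj, target_map, lr=1e-3):
    # One refinement step on the noisy latent z: minimize
    # ||P z - t||^2, whose gradient is 2 P^T (P z - t).
    residual = proj @ latent - target_map
    grad = 2.0 * proj.T @ residual
    return latent - lr * grad

rng = np.random.default_rng(1)
P = rng.normal(size=(32, 64))   # toy stand-in for attention-map extraction
z = rng.normal(size=(64,))      # noisy latent at an intermediate step
t = rng.normal(size=(32,))      # target layout derived from the sketch

initial_loss = np.sum((P @ z - t) ** 2)
for _ in range(500):            # refine the latent at this step
    z = latent_optimization_step(z, P, t)
final_loss = np.sum((P @ z - t) ** 2)
```

In a real diffusion pipeline this inner loop would run at selected denoising steps, with the gradient taken through the UNet's cross-attention maps rather than a fixed linear map.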
arXiv Detail & Related papers (2024-08-31T00:44:03Z) - CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion [74.44273919041912]
Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images.
However, adapting these models for artistic image editing presents two significant challenges.
We build CreativeSynth, a unified framework based on a diffusion model capable of coordinating multimodal inputs.
arXiv Detail & Related papers (2024-01-25T10:42:09Z) - ArtBank: Artistic Style Transfer with Pre-trained Diffusion Model and Implicit Style Prompt Bank [9.99530386586636]
Artistic style transfer aims to repaint the content image with the learned artistic style.
Existing artistic style transfer methods can be divided into two categories: small model-based approaches and pre-trained large-scale model-based approaches.
We propose ArtBank, a novel artistic style transfer framework, to generate highly realistic stylized images.
arXiv Detail & Related papers (2023-12-11T05:53:40Z) - Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion [50.59261592343479]
We present Kandinsky, a novel exploration of latent diffusion architecture.
The image prior model is trained separately to map text embeddings to the image embeddings of CLIP.
We also deployed a user-friendly demo system that supports diverse generative modes such as text-to-image generation, image fusion, text and image fusion, image variations generation, and text-guided inpainting/outpainting.
arXiv Detail & Related papers (2023-10-05T12:29:41Z) - CCLAP: Controllable Chinese Landscape Painting Generation via Latent Diffusion Model [54.74470985388726]
We propose CCLAP, a controllable Chinese landscape painting generation method based on a latent diffusion model.
Our method achieves state-of-the-art performance, especially in artful composition and artistic conception.
arXiv Detail & Related papers (2023-04-09T04:16:28Z) - QuantArt: Quantizing Image Style Transfer Towards High Visual Fidelity [94.5479418998225]
We propose a new style transfer framework called QuantArt for high visual-fidelity stylization.
Our framework achieves significantly higher visual fidelity compared with the existing style transfer methods.
arXiv Detail & Related papers (2022-12-20T17:09:53Z) - Modeling Artistic Workflows for Image Generation and Editing [83.43047077223947]
We propose a generative model that follows a given artistic workflow.
It enables both multi-stage image generation as well as multi-stage image editing of an existing piece of art.
arXiv Detail & Related papers (2020-07-14T17:54:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.