Artistic Intelligence: A Diffusion-Based Framework for High-Fidelity Landscape Painting Synthesis
- URL: http://arxiv.org/abs/2407.17229v4
- Date: Fri, 11 Oct 2024 08:48:03 GMT
- Title: Artistic Intelligence: A Diffusion-Based Framework for High-Fidelity Landscape Painting Synthesis
- Authors: Wanggong Yang, Yifei Zhao
- Abstract summary: LPGen is a novel diffusion-based model specifically designed for landscape painting generation.
LPGen introduces a decoupled cross-attention mechanism that independently processes structural and stylistic features.
The model is pre-trained on a curated dataset of high-resolution landscape images, categorized by distinct artistic styles, and then fine-tuned to ensure detailed and consistent output.
- Score: 2.205829309604458
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating high-fidelity landscape paintings remains a challenging task that requires precise control over both structure and style. In this paper, we present LPGen, a novel diffusion-based model specifically designed for landscape painting generation. LPGen introduces a decoupled cross-attention mechanism that independently processes structural and stylistic features, effectively mimicking the layered approach of traditional painting techniques. LPGen also introduces a structural controller, a multi-scale encoder that controls the layout of landscape paintings, striking a balance between aesthetics and composition. In addition, the model is pre-trained on a curated dataset of high-resolution landscape images, categorized by distinct artistic styles, and then fine-tuned to ensure detailed and consistent output. Through extensive evaluations, LPGen demonstrates superior performance in producing paintings that are not only structurally accurate but also stylistically coherent, surpassing current state-of-the-art models. This work advances AI-generated art and offers new avenues for exploring the intersection of technology and traditional artistic practices. Our code, dataset, and model weights will be publicly available.
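The core idea of decoupled cross-attention is that structural and stylistic conditioning tokens each get their own attention pass over the latent, and the results are combined afterward. The following is a minimal NumPy sketch of that idea only; the tensor shapes, weighting scheme, and function names are illustrative assumptions, not the LPGen architecture itself.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Standard scaled dot-product attention.
    scale = 1.0 / np.sqrt(q.shape[-1])
    return softmax((q @ k.T) * scale) @ v

def decoupled_cross_attention(latent_q, struct_feats, style_feats,
                              w_struct=1.0, w_style=1.0):
    # Two independent cross-attention passes: one over structural
    # (layout) tokens, one over style tokens. Because the passes are
    # decoupled, each conditioning signal can be weighted separately.
    out_struct = attention(latent_q, struct_feats, struct_feats)
    out_style = attention(latent_q, style_feats, style_feats)
    return w_struct * out_struct + w_style * out_style

rng = np.random.default_rng(0)
q = rng.normal(size=(16, 64))      # latent image tokens (hypothetical shape)
struct = rng.normal(size=(8, 64))  # structural condition tokens
style = rng.normal(size=(4, 64))   # style reference tokens
out = decoupled_cross_attention(q, struct, style)
print(out.shape)  # (16, 64)
```

Setting `w_style=0.0` leaves only the structural pass, which is the practical benefit of decoupling: either signal can be attenuated without retraining the other.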
Related papers
- Personalized Image Generation with Deep Generative Models: A Decade Survey [51.26287478042516]
We present a review of generalized personalized image generation across various generative models.
We first define a unified framework that standardizes the personalization process across different generative models.
We then provide an in-depth analysis of personalization techniques within each generative model, highlighting their unique contributions and innovations.
arXiv Detail & Related papers (2025-02-18T17:34:04Z) - LineArt: A Knowledge-guided Training-free High-quality Appearance Transfer for Design Drawing with Diffusion Model [8.938617090786494]
We present LineArt, a framework that transfers complex appearance onto detailed design drawings.
It generates high-fidelity appearance while preserving structural accuracy by simulating hierarchical visual cognition.
It requires no precise 3D modeling, physical property specs, or network training, making it more convenient for design tasks.
arXiv Detail & Related papers (2024-12-16T07:54:45Z) - A Tiered GAN Approach for Monet-Style Image Generation [0.562479170374811]
This paper introduces a tiered GAN model to progressively refine image quality through a multi-stage process.
The model transforms random noise into detailed artistic representations, addressing common challenges such as instability in training, mode collapse, and output quality.
arXiv Detail & Related papers (2024-12-07T19:10:29Z) - Diffusion Models with Anisotropic Gaussian Splatting for Image Inpainting [0.0]
We propose a novel inpainting method that combines diffusion models with anisotropic Gaussian splatting to capture both local structures and global context effectively.
Our method outperforms state-of-the-art techniques, producing visually plausible results with enhanced structural integrity and texture realism.
arXiv Detail & Related papers (2024-12-02T16:29:06Z) - Training-Free Sketch-Guided Diffusion with Latent Optimization [22.94468603089249]
We propose an innovative training-free pipeline that extends existing text-to-image generation models to incorporate a sketch as an additional condition.
To generate new images with a layout and structure closely resembling the input sketch, we find that these core features of a sketch can be tracked with the cross-attention maps of diffusion models.
We introduce latent optimization, a method that refines the noisy latent at each intermediate step of the generation process.
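Latent optimization, as summarized above, refines the noisy latent at an intermediate denoising step so that an attention-derived layout signal moves toward the target implied by the input sketch. The toy sketch below illustrates only the refinement loop, using a fixed linear map as a stand-in for the cross-attention-map extraction; the projection, learning rate, and loss are assumptions for illustration, not the paper's actual objective.

```python
import numpy as np

def latent_optimization_step(latent, proj, target_map, lr=1e-3):
    # One refinement step on the noisy latent z: minimize
    # ||P z - t||^2, whose gradient is 2 P^T (P z - t).
    residual = proj @ latent - target_map
    grad = 2.0 * proj.T @ residual
    return latent - lr * grad

rng = np.random.default_rng(1)
P = rng.normal(size=(32, 64))   # toy stand-in for attention-map extraction
z = rng.normal(size=(64,))      # noisy latent at an intermediate step
t = rng.normal(size=(32,))      # target layout derived from the sketch

initial_loss = np.sum((P @ z - t) ** 2)
for _ in range(500):            # refine the latent at this step
    z = latent_optimization_step(z, P, t)
final_loss = np.sum((P @ z - t) ** 2)
```

In a real diffusion pipeline this inner loop would run at selected denoising steps, with the gradient taken through the UNet's cross-attention maps rather than a fixed linear map.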
arXiv Detail & Related papers (2024-08-31T00:44:03Z) - CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion [74.44273919041912]
Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images.
However, adapting these models for artistic image editing presents two significant challenges.
We build CreativeSynth, a unified framework based on a diffusion model capable of coordinating multimodal inputs.
arXiv Detail & Related papers (2024-01-25T10:42:09Z) - ArtBank: Artistic Style Transfer with Pre-trained Diffusion Model and Implicit Style Prompt Bank [9.99530386586636]
Artistic style transfer aims to repaint the content image with the learned artistic style.
Existing artistic style transfer methods can be divided into two categories: small model-based approaches and pre-trained large-scale model-based approaches.
We propose ArtBank, a novel artistic style transfer framework, to generate highly realistic stylized images.
arXiv Detail & Related papers (2023-12-11T05:53:40Z) - Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion [50.59261592343479]
We present Kandinsky, a novel exploration of latent diffusion architecture.
The image prior model is trained separately to map text embeddings to the image embeddings of CLIP.
We also deployed a user-friendly demo system that supports diverse generative modes such as text-to-image generation, image fusion, text and image fusion, image variations generation, and text-guided inpainting/outpainting.
arXiv Detail & Related papers (2023-10-05T12:29:41Z) - CCLAP: Controllable Chinese Landscape Painting Generation via Latent Diffusion Model [54.74470985388726]
We propose CCLAP, a controllable Chinese landscape painting generation method based on a latent diffusion model.
Our method achieves state-of-the-art performance, especially in artful composition and artistic conception.
arXiv Detail & Related papers (2023-04-09T04:16:28Z) - QuantArt: Quantizing Image Style Transfer Towards High Visual Fidelity [94.5479418998225]
We propose a new style transfer framework called QuantArt for high visual-fidelity stylization.
Our framework achieves significantly higher visual fidelity compared with the existing style transfer methods.
arXiv Detail & Related papers (2022-12-20T17:09:53Z) - Modeling Artistic Workflows for Image Generation and Editing [83.43047077223947]
We propose a generative model that follows a given artistic workflow.
It enables both multi-stage image generation as well as multi-stage image editing of an existing piece of art.
arXiv Detail & Related papers (2020-07-14T17:54:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.