A Tiered GAN Approach for Monet-Style Image Generation
- URL: http://arxiv.org/abs/2412.05724v1
- Date: Sat, 07 Dec 2024 19:10:29 GMT
- Title: A Tiered GAN Approach for Monet-Style Image Generation
- Authors: FNU Neha, Deepshikha Bhati, Deepak Kumar Shukla, Md Amiruzzaman
- Abstract summary: This paper introduces a tiered GAN model to progressively refine image quality through a multi-stage process. The model transforms random noise into detailed artistic representations, addressing common challenges such as training instability, mode collapse, and poor output quality.
- Score: 0.562479170374811
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative Adversarial Networks (GANs) have proven to be a powerful tool for generating artistic images, capable of mimicking the styles of renowned painters such as Claude Monet. This paper introduces a tiered GAN model that progressively refines image quality through a multi-stage process, enhancing the generated images at each step. The model transforms random noise into detailed artistic representations, addressing common challenges such as training instability, mode collapse, and poor output quality. The approach combines downsampling and convolutional techniques, enabling the generation of high-quality Monet-style artwork while optimizing computational efficiency. Experimental results demonstrate the architecture's ability to produce foundational artistic structures, though further refinement is needed to achieve higher levels of realism and fidelity to Monet's style. Future work will focus on improving training methodologies and model complexity to bridge the gap between generated and true artistic images. Additionally, the limitations of traditional GANs in artistic generation are analyzed, and strategies to overcome these shortcomings are proposed.
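The abstract does not include code, so the following PyTorch sketch is only a hedged illustration of the tiered idea: a first-stage generator maps noise to a coarse image, and a second-stage refiner upsamples and sharpens it with further convolutions. All layer choices, sizes, and names (CoarseGenerator, Refiner) are assumptions for illustration, not the authors' architecture.

```python
import torch
import torch.nn as nn

class CoarseGenerator(nn.Module):
    """Stage 1 (hypothetical): noise -> coarse 32x32 RGB image."""
    def __init__(self, z_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 256, 4, 1, 0),  # 1x1 -> 4x4
            nn.BatchNorm2d(256), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1),    # 4x4 -> 8x8
            nn.BatchNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1),     # 8x8 -> 16x16
            nn.BatchNorm2d(64), nn.ReLU(True),
            nn.ConvTranspose2d(64, 3, 4, 2, 1),       # 16x16 -> 32x32
            nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z.view(z.size(0), -1, 1, 1))

class Refiner(nn.Module):
    """Stage 2 (hypothetical): coarse 32x32 image -> refined 64x64 image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, 1, 1), nn.ReLU(True),
            nn.Upsample(scale_factor=2, mode="nearest"),  # 32x32 -> 64x64
            nn.Conv2d(64, 64, 3, 1, 1), nn.ReLU(True),
            nn.Conv2d(64, 3, 3, 1, 1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

z = torch.randn(4, 100)
coarse = CoarseGenerator()(z)   # (4, 3, 32, 32)
refined = Refiner()(coarse)     # (4, 3, 64, 64)
print(coarse.shape, refined.shape)
```

In a full tiered pipeline each stage would be trained adversarially against its own discriminator; the sketch shows only the forward pass.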
Related papers
- Boosting Generative Image Modeling via Joint Image-Feature Synthesis [10.32324138962724]
We introduce a novel generative image modeling framework that seamlessly bridges the gap by leveraging a diffusion model to jointly model low-level image latents and high-level semantic features.
Our latent-semantic diffusion approach learns to generate coherent image-feature pairs from pure noise.
By eliminating the need for complex distillation objectives, our unified design simplifies training and unlocks a powerful new inference strategy: Representation Guidance.
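As a loose sketch of the joint-modeling idea (the actual framework works on VAE image latents and pretrained-encoder features with a much richer denoiser; the shapes and the simple MLP below are assumptions), a single network can predict noise for a concatenated latent-plus-feature vector so both parts are generated together:

```python
import torch
import torch.nn as nn

class JointDenoiser(nn.Module):
    """Hypothetical denoiser over a concatenated [image-latent, feature] vector."""
    def __init__(self, latent_dim=64, feat_dim=64):
        super().__init__()
        d = latent_dim + feat_dim
        self.net = nn.Sequential(nn.Linear(d + 1, 256), nn.SiLU(),
                                 nn.Linear(256, d))

    def forward(self, x, t):
        # t: (B, 1) diffusion timestep, appended as an extra input feature
        return self.net(torch.cat([x, t], dim=1))

x = torch.randn(4, 128)          # concatenated image latent + semantic feature
t = torch.rand(4, 1)
eps_hat = JointDenoiser()(x, t)  # joint noise prediction for both parts
print(eps_hat.shape)             # torch.Size([4, 128])
```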
arXiv Detail & Related papers (2025-04-22T17:41:42Z)
- Artistic Intelligence: A Diffusion-Based Framework for High-Fidelity Landscape Painting Synthesis [2.205829309604458]
LPGen is a novel diffusion-based model specifically designed for landscape painting generation.
LPGen introduces a decoupled cross-attention mechanism that independently processes structural and stylistic features.
The model is pre-trained on a curated dataset of high-resolution landscape images, categorized by distinct artistic styles, and then fine-tuned to ensure detailed and consistent output.
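A minimal sketch of what a decoupled cross-attention block might look like (the class name, dimensions, and residual fusion are assumptions; LPGen's actual design may differ): two independent cross-attention branches attend to structural and stylistic feature streams, and their outputs are combined.

```python
import torch
import torch.nn as nn

class DecoupledCrossAttention(nn.Module):
    """Hypothetical block: separate cross-attention for structure and style."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn_struct = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.attn_style = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x, struct_feats, style_feats):
        # Each branch queries the image tokens against one condition stream.
        s, _ = self.attn_struct(x, struct_feats, struct_feats)
        t, _ = self.attn_style(x, style_feats, style_feats)
        return self.norm(x + s + t)  # residual fusion of both branches

x = torch.randn(2, 64, 256)          # image tokens
structure = torch.randn(2, 77, 256)  # structural condition tokens
style = torch.randn(2, 77, 256)      # stylistic condition tokens
print(DecoupledCrossAttention()(x, structure, style).shape)  # (2, 64, 256)
```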
arXiv Detail & Related papers (2024-07-24T12:32:24Z)
- Colour and Brush Stroke Pattern Recognition in Abstract Art using Modified Deep Convolutional Generative Adversarial Networks [0.0]
This paper describes a study of a wide distribution of abstract paintings using Generative Adversarial Networks (GANs). The challenge lies in developing an efficient GAN architecture that overcomes common training pitfalls. This paper introduces a modified DCGAN (mDCGAN) specifically designed for high-quality artwork generation.
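One widely used remedy for such pitfalls, shown here only as an example and not necessarily what mDCGAN does, is one-sided label smoothing in the discriminator loss:

```python
import torch
import torch.nn.functional as F

def d_loss_smoothed(d_real, d_fake, smooth=0.9):
    """Discriminator BCE loss with one-sided label smoothing: real targets
    are set to 0.9 instead of 1.0 to keep the discriminator from becoming
    overconfident, a common GAN stabilization trick."""
    real_targets = torch.full_like(d_real, smooth)
    fake_targets = torch.zeros_like(d_fake)
    return (F.binary_cross_entropy_with_logits(d_real, real_targets)
            + F.binary_cross_entropy_with_logits(d_fake, fake_targets))

# Dummy discriminator logits just to show the call:
loss = d_loss_smoothed(torch.randn(8, 1), torch.randn(8, 1))
print(loss.item())
```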
arXiv Detail & Related papers (2024-03-27T09:35:56Z)
- Active Generation for Image Classification [45.93535669217115]
We propose to improve the efficiency of image generation by focusing on the specific needs and characteristics of the model being trained.
Following the central tenet of active learning, our method, named ActGen, takes a training-aware approach to image generation.
arXiv Detail & Related papers (2024-03-11T08:45:31Z)
- CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion [74.44273919041912]
Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images.
However, adapting these models for artistic image editing presents two significant challenges.
We build CreativeSynth, an innovative unified framework based on a diffusion model with the ability to coordinate multimodal inputs.
arXiv Detail & Related papers (2024-01-25T10:42:09Z)
- Image Inpainting via Tractable Steering of Diffusion Models [54.13818673257381]
This paper proposes to exploit the ability of Tractable Probabilistic Models (TPMs) to exactly and efficiently compute the constrained posterior.
Specifically, this paper adopts a class of expressive TPMs termed Probabilistic Circuits (PCs).
We show that our approach can consistently improve the overall quality and semantic coherence of inpainted images with only 10% additional computational overhead.
arXiv Detail & Related papers (2023-11-28T21:14:02Z)
- CoDi: Conditional Diffusion Distillation for Higher-Fidelity and Faster Image Generation [49.3016007471979]
Large generative diffusion models have revolutionized text-to-image generation and offer immense potential for conditional generation tasks.
However, their widespread adoption is hindered by the high computational cost, which limits their real-time application.
We introduce CoDi, a novel method that adapts a pre-trained latent diffusion model to accept additional image conditioning inputs.
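One common way to graft image conditioning onto a pretrained latent diffusion model, offered here as an assumption rather than CoDi's exact mechanism, is to widen the UNet's input convolution and zero-initialize the new weights so behavior is unchanged at the start of fine-tuning:

```python
import torch
import torch.nn as nn

def widen_input_conv(conv: nn.Conv2d, extra_channels: int) -> nn.Conv2d:
    """Extend a pretrained input conv to accept extra conditioning channels;
    new weights start at zero so the widened model initially matches the
    original."""
    new = nn.Conv2d(conv.in_channels + extra_channels, conv.out_channels,
                    conv.kernel_size, conv.stride, conv.padding)
    with torch.no_grad():
        new.weight.zero_()
        new.weight[:, :conv.in_channels] = conv.weight
        new.bias.copy_(conv.bias)
    return new

old = nn.Conv2d(4, 320, 3, padding=1)        # e.g., a latent UNet's input conv
new = widen_input_conv(old, extra_channels=4)
x = torch.cat([torch.randn(1, 4, 64, 64),    # noisy latent
               torch.randn(1, 4, 64, 64)],   # image-conditioning latent
              dim=1)
print(new(x).shape)  # torch.Size([1, 320, 64, 64])
```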
arXiv Detail & Related papers (2023-10-02T17:59:18Z)
- QuantArt: Quantizing Image Style Transfer Towards High Visual Fidelity [94.5479418998225]
We propose a new style transfer framework called QuantArt for high visual-fidelity stylization.
Our framework achieves significantly higher visual fidelity compared with existing style transfer methods.
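The core vector-quantization step behind such approaches, sketched here with assumed shapes (QuantArt's full pipeline is considerably more involved), snaps each feature vector to its nearest codebook entry:

```python
import torch

def quantize(features: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    """Nearest-codebook lookup: features (N, D), codebook (K, D) -> (N, D)."""
    dists = torch.cdist(features, codebook)  # (N, K) pairwise distances
    idx = dists.argmin(dim=1)                # nearest code per feature
    return codebook[idx]

feats = torch.randn(16, 64)
codebook = torch.randn(512, 64)
print(quantize(feats, codebook).shape)  # torch.Size([16, 64])
```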
arXiv Detail & Related papers (2022-12-20T17:09:53Z)
- DC-Art-GAN: Stable Procedural Content Generation using DC-GANs for Digital Art [4.9631159466100305]
We advocate the use of deep generative networks with adversarial training for stable and varied art generation.
The work mainly focuses on the Deep Convolutional Generative Adversarial Network (DC-GAN) and explores techniques to address common pitfalls in GAN training.
arXiv Detail & Related papers (2022-09-06T23:06:46Z)
- InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images into the latent space of a high-quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
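The round trip that inversion enables can be sketched with toy stand-in networks (the linear encoder and generator below are placeholders, not InvGAN's architecture):

```python
import torch
import torch.nn as nn

# Toy encoder/generator pair; only the embed-then-regenerate round trip matters.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))
generator = nn.Sequential(nn.Linear(128, 3 * 64 * 64), nn.Tanh(),
                          nn.Unflatten(1, (3, 64, 64)))

real = torch.rand(2, 3, 64, 64)
z = encoder(real)        # embed real images into the latent space
recon = generator(z)     # regenerate them from the latent codes
# Inpainting, merging, or augmentation then amount to manipulating z
# before regenerating.
print(z.shape, recon.shape)
```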
arXiv Detail & Related papers (2021-12-08T21:39:00Z)
- A Generic Approach for Enhancing GANs by Regularized Latent Optimization [79.00740660219256]
We introduce a generic framework called "generative-model inference" that is capable of enhancing pre-trained GANs effectively and seamlessly.
Our basic idea is to efficiently infer the optimal latent distribution for the given requirements using Wasserstein gradient flow techniques.
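As a crude stand-in for that inference step (plain gradient descent on the latent rather than a Wasserstein gradient flow, with a toy generator and a made-up requirement loss), one can optimize the latent directly:

```python
import torch
import torch.nn as nn

g = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))  # toy generator
target = torch.randn(1, 8)  # stand-in "requirement" the output should match

z = torch.randn(1, 16, requires_grad=True)
opt = torch.optim.Adam([z], lr=0.05)
for _ in range(200):
    opt.zero_grad()
    loss = (g(z) - target).pow(2).mean()  # requirement-matching loss
    loss.backward()
    opt.step()
print(loss.item())  # loss after latent optimization
```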
arXiv Detail & Related papers (2021-12-07T05:22:50Z)
- Modeling Artistic Workflows for Image Generation and Editing [83.43047077223947]
We propose a generative model that follows a given artistic workflow.
It enables both multi-stage image generation and multi-stage editing of an existing piece of art.
arXiv Detail & Related papers (2020-07-14T17:54:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.