SRAGAN: Saliency Regularized and Attended Generative Adversarial Network for Chinese Ink-wash Painting Generation
- URL: http://arxiv.org/abs/2404.15743v2
- Date: Sun, 05 Jan 2025 07:27:15 GMT
- Title: SRAGAN: Saliency Regularized and Attended Generative Adversarial Network for Chinese Ink-wash Painting Generation
- Authors: Xiang Gao, Yuqi Zhang,
- Abstract summary: This paper handles the problem of translating real pictures into traditional Chinese ink-wash paintings.<n>We propose to incorporate saliency detection into the unpaired I2I framework to regularize image content.
- Score: 17.238908596339904
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent style transfer problems are still largely dominated by Generative Adversarial Network (GAN) from the perspective of cross-domain image-to-image (I2I) translation, where the pivotal issue is to learn and transfer target-domain style patterns onto source-domain content images. This paper handles the problem of translating real pictures into traditional Chinese ink-wash paintings, i.e., Chinese ink-wash painting style transfer. Though a wide range of I2I models tackle this problem, a notable challenge is that the content details of the source image could be easily erased or corrupted due to the transfer of ink-wash style elements. To remedy this issue, we propose to incorporate saliency detection into the unpaired I2I framework to regularize image content, where the detected saliency map is utilized from two aspects: (\romannumeral1) we propose saliency IOU (SIOU) loss to explicitly regularize object content structure by enforcing saliency consistency before and after image stylization; (\romannumeral2) we propose saliency adaptive normalization (SANorm) which implicitly enhances object structure integrity of the generated paintings by dynamically injecting image saliency information into the generator to guide stylization process. Besides, we also propose saliency attended discriminator which harnesses image saliency information to focus generative adversarial attention onto the drawn objects, contributing to generating more vivid and delicate brush strokes and ink-wash textures. Extensive qualitative and quantitative experiments demonstrate superiority of our approach over related advanced image stylization methods in both GAN and diffusion model paradigms.
Related papers
- Free-Lunch Color-Texture Disentanglement for Stylized Image Generation [58.406368812760256]
This paper introduces the first tuning-free approach to achieve free-lunch color-texture disentanglement in stylized T2I generation.
We develop techniques for separating and extracting Color-Texture Embeddings (CTE) from individual color and texture reference images.
To ensure that the color palette of the generated image aligns closely with the color reference, we apply a whitening and coloring transformation.
arXiv Detail & Related papers (2025-03-18T14:10:43Z) - Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator [44.620847608977776]
Diptych Prompting is a novel zero-shot approach that reinterprets as an inpainting task with precise subject alignment.
Our method supports not only subject-driven generation but also stylized image generation and subject-driven image editing.
arXiv Detail & Related papers (2024-11-23T06:17:43Z) - Improving Text-guided Object Inpainting with Semantic Pre-inpainting [95.17396565347936]
We decompose the typical single-stage object inpainting into two cascaded processes: semantic pre-inpainting and high-fieldity object generation.
To achieve this, we cascade a Transformer-based semantic inpainter and an object inpainting diffusion model, leading to a novel CAscaded Transformer-Diffusion framework.
arXiv Detail & Related papers (2024-09-12T17:55:37Z) - DAFT-GAN: Dual Affine Transformation Generative Adversarial Network for Text-Guided Image Inpainting [2.656795553429629]
We propose a dual affine transformation generative adversarial network (DAFT-GAN) to maintain the semantic consistency for text-guided inpainting.
Our proposed model outperforms the existing GAN-based models in both qualitative and quantitative assessments.
arXiv Detail & Related papers (2024-08-09T09:28:42Z) - Sketch-guided Image Inpainting with Partial Discrete Diffusion Process [5.005162730122933]
We introduce a novel partial discrete diffusion process (PDDP) for sketch-guided inpainting.
PDDP corrupts the masked regions of the image and reconstructs these masked regions conditioned on hand-drawn sketches.
The proposed novel transformer module accepts two inputs -- the image containing the masked region to be inpainted and the query sketch to model the reverse diffusion process.
arXiv Detail & Related papers (2024-04-18T07:07:38Z) - BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed
Dual-Branch Diffusion [61.90969199199739]
BrushNet is a novel plug-and-play dual-branch model engineered to embed pixel-level masked image features into any pre-trained DM.
BrushNet's superior performance over existing models across seven key metrics, including image quality, mask region preservation, and textual coherence.
arXiv Detail & Related papers (2024-03-11T17:59:31Z) - DLP-GAN: learning to draw modern Chinese landscape photos with
generative adversarial network [20.74857981451259]
Chinese landscape painting has a unique and artistic style, and its drawing technique is highly abstract in both the use of color and the realistic representation of objects.
Previous methods focus on transferring from modern photos to ancient ink paintings, but little attention has been paid to translating landscape paintings into modern photos.
arXiv Detail & Related papers (2024-03-06T04:46:03Z) - Decoupled Textual Embeddings for Customized Image Generation [62.98933630971543]
Customized text-to-image generation aims to learn user-specified concepts with a few images.
Existing methods usually suffer from overfitting issues and entangle the subject-unrelated information with the learned concept.
We propose the DETEX, a novel approach that learns the disentangled concept embedding for flexible customized text-to-image generation.
arXiv Detail & Related papers (2023-12-19T03:32:10Z) - DreamInpainter: Text-Guided Subject-Driven Image Inpainting with
Diffusion Models [37.133727797607676]
This study introduces Text-Guided Subject-Driven Image Inpainting.
We compute dense subject features to ensure accurate subject replication.
We employ a discriminative token selection module to eliminate redundant subject details.
arXiv Detail & Related papers (2023-12-05T22:23:19Z) - Portrait Diffusion: Training-free Face Stylization with
Chain-of-Painting [64.43760427752532]
Face stylization refers to the transformation of a face into a specific portrait style.
Current methods require the use of example-based adaptation approaches to fine-tune pre-trained generative models.
This paper proposes a training-free face stylization framework, named Portrait Diffusion.
arXiv Detail & Related papers (2023-12-03T06:48:35Z) - T2IW: Joint Text to Image & Watermark Generation [74.20148555503127]
We introduce a novel task for the joint generation of text to image and watermark (T2IW)
This T2IW scheme ensures minimal damage to image quality when generating a compound image by forcing the semantic feature and the watermark signal to be compatible in pixels.
We demonstrate remarkable achievements in image quality, watermark invisibility, and watermark robustness, supported by our proposed set of evaluation metrics.
arXiv Detail & Related papers (2023-09-07T16:12:06Z) - LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image
Generation [121.45667242282721]
We propose a coarse-to-fine paradigm to achieve layout planning and image generation.
Our proposed method outperforms the state-of-the-art models in terms of photorealistic layout and image generation.
arXiv Detail & Related papers (2023-08-09T17:45:04Z) - StyleStegan: Leak-free Style Transfer Based on Feature Steganography [19.153040728118285]
existing style transfer methods suffer from a serious content leakage issue.
We propose a leak-free style transfer method based on feature steganography.
The results demonstrate that StyleStegan successfully mitigates the content leakage issue in serial and reversible style transfer tasks.
arXiv Detail & Related papers (2023-07-01T05:00:19Z) - StyO: Stylize Your Face in Only One-shot [11.715601955568536]
This paper focuses on face stylization with a single artistic target.
Existing works for this task often fail to retain the source content while achieving geometry variation.
We present a novel StyO model, ie. Stylize the face in only One-shot, to solve the above problem.
arXiv Detail & Related papers (2023-03-06T15:48:33Z) - Zero-shot Image-to-Image Translation [57.46189236379433]
We propose pix2pix-zero, an image-to-image translation method that can preserve the original image without manual prompting.
We propose cross-attention guidance, which aims to retain the cross-attention maps of the input image throughout the diffusion process.
Our method does not need additional training for these edits and can directly use the existing text-to-image diffusion model.
arXiv Detail & Related papers (2023-02-06T18:59:51Z) - Multi-Modality Image Inpainting using Generative Adversarial Networks [0.0]
We propose a model to address the problem of combining the image inpainting task with the multi-modality image-to-image translation.
The model will be evaluated on combined night-to-day image translation and inpainting, along with promising qualitative and quantitative results.
arXiv Detail & Related papers (2022-06-18T14:06:14Z) - In&Out : Diverse Image Outpainting via GAN Inversion [89.84841983778672]
Image outpainting seeks for a semantically consistent extension of the input image beyond its available content.
In this work, we formulate the problem from the perspective of inverting generative adversarial networks.
Our generator renders micro-patches conditioned on their joint latent code as well as their individual positions in the image.
arXiv Detail & Related papers (2021-04-01T17:59:10Z) - Towards Unsupervised Deep Image Enhancement with Generative Adversarial
Network [92.01145655155374]
We present an unsupervised image enhancement generative network (UEGAN)
It learns the corresponding image-to-image mapping from a set of images with desired characteristics in an unsupervised manner.
Results show that the proposed model effectively improves the aesthetic quality of images.
arXiv Detail & Related papers (2020-12-30T03:22:46Z) - Two-Stream Appearance Transfer Network for Person Image Generation [16.681839931864886]
generative adversarial networks (GANs) widely used for image generation and translation rely on spatially local and translation equivariant operators.
This paper introduces a novel two-stream appearance transfer network (2s-ATN) to address this challenge.
It is a multi-stage architecture consisting of a source stream and a target stream. Each stage features an appearance transfer module and several two-stream feature fusion modules.
arXiv Detail & Related papers (2020-11-09T04:21:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.