CIMGEN: Controlled Image Manipulation by Finetuning Pretrained
Generative Models on Limited Data
- URL: http://arxiv.org/abs/2401.13006v1
- Date: Tue, 23 Jan 2024 06:30:47 GMT
- Title: CIMGEN: Controlled Image Manipulation by Finetuning Pretrained
Generative Models on Limited Data
- Authors: Chandrakanth Gudavalli, Erik Rosten, Lakshmanan Nataraj, Shivkumar
Chandrasekaran, B. S. Manjunath
- Abstract summary: A semantic map contains information about the objects present in the image.
One can easily modify the map to selectively insert, remove, or replace objects.
The method proposed in this paper takes in the modified semantic map and alters the original image in accordance with the modified map.
- Score: 14.469539513542584
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Content creation and image editing can benefit from flexible user
controls. A common intermediate representation for conditional image generation
is a semantic map, which encodes the objects present in the image. Compared to
raw RGB pixels, a semantic map is much easier to modify: one can selectively
insert, remove, or replace objects in the map. The method proposed in this
paper takes in the modified semantic map and alters the original image in
accordance with the modified map. The method leverages traditional pre-trained
image-to-image translation GANs, such as CycleGAN or Pix2Pix, which are
fine-tuned on a limited dataset of reference images associated with the
semantic maps. We discuss the qualitative and quantitative performance of our
technique to illustrate its capacity and possible applications in the fields of
image forgery and image editing. We also demonstrate the effectiveness of the
proposed image forgery technique in thwarting numerous deep learning-based
image forensic techniques, highlighting the urgent need to develop robust and
generalizable image forensic tools in the fight against the spread of fake
media.
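As a concrete illustration of the map-editing step described in the abstract, the sketch below shows how objects can be inserted or removed by relabeling pixels in an integer segmentation map. The class IDs, map size, and rectangular insertion region are assumptions chosen for illustration, not details taken from the paper.

```python
import numpy as np

# Hypothetical class IDs, chosen for illustration (not from the paper).
ROAD, PERSON, CAR = 7, 24, 26

def remove_object(seg_map: np.ndarray, target_id: int, fill_id: int) -> np.ndarray:
    """Remove an object class by relabeling its pixels with a fill class."""
    edited = seg_map.copy()
    edited[edited == target_id] = fill_id
    return edited

def insert_object(seg_map: np.ndarray, box: tuple, new_id: int) -> np.ndarray:
    """Insert an object by painting a rectangular region with a new class ID."""
    y0, y1, x0, x1 = box
    edited = seg_map.copy()
    edited[y0:y1, x0:x1] = new_id
    return edited

# A toy 256x512 map: all road, with one car region.
seg = np.full((256, 512), ROAD, dtype=np.uint8)
seg[180:230, 100:200] = CAR

seg_removed = remove_object(seg, CAR, ROAD)                      # remove the car
seg_inserted = insert_object(seg_removed, (170, 240, 300, 360), PERSON)  # insert a person
```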
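The fine-tuning step can be sketched as a standard Pix2Pix-style training loop run briefly on the small paired dataset. This is a minimal sketch, not the authors' implementation: `load_pretrained_generator`, `load_pretrained_discriminator`, and `small_paired_dataset` are hypothetical placeholders, and the losses and hyperparameters follow common Pix2Pix defaults since the paper's exact recipe is not given here.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

# Hypothetical helpers: a pretrained Pix2Pix-style generator/discriminator and
# a small dataset of (one-hot semantic map, reference image) tensor pairs.
G = load_pretrained_generator()       # semantic map -> RGB image
D = load_pretrained_discriminator()   # patch discriminator on (map, image) pairs
loader = DataLoader(small_paired_dataset, batch_size=4, shuffle=True)

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

for epoch in range(50):              # short schedule, since the data is limited
    for seg, real in loader:         # seg: one-hot map, real: RGB image
        fake = G(seg)

        # Discriminator step: distinguish real pairs from generated pairs.
        opt_d.zero_grad()
        d_real = D(torch.cat([seg, real], dim=1))
        d_fake = D(torch.cat([seg, fake.detach()], dim=1))
        loss_d = (bce(d_real, torch.ones_like(d_real))
                  + bce(d_fake, torch.zeros_like(d_fake)))
        loss_d.backward()
        opt_d.step()

        # Generator step: fool D, plus an L1 term pulling the output toward
        # the reference image (the usual Pix2Pix objective).
        opt_g.zero_grad()
        d_fake = D(torch.cat([seg, fake], dim=1))
        loss_g = bce(d_fake, torch.ones_like(d_fake)) + 100.0 * l1(fake, real)
        loss_g.backward()
        opt_g.step()
```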
Related papers
- Towards Understanding Cross and Self-Attention in Stable Diffusion for
Text-Guided Image Editing [47.71851180196975]
Tuning-free Text-guided Image Editing (TIE) is of greater importance for application developers.
We conduct an in-depth probing analysis and demonstrate that cross-attention maps in Stable Diffusion often contain object attribution information.
In contrast, self-attention maps play a crucial role in preserving the geometric and shape details of the source image.
arXiv Detail & Related papers (2024-03-06T03:32:56Z)
- iEdit: Localised Text-guided Image Editing with Weak Supervision [53.082196061014734]
We propose a novel learning method for text-guided image editing.
It generates images conditioned on a source image and a textual edit prompt.
It shows favourable results against its counterparts in terms of image fidelity and CLIP alignment score, and qualitatively for editing both generated and real images.
arXiv Detail & Related papers (2023-05-10T07:39:14Z)
- DiffEdit: Diffusion-based semantic image editing with mask guidance [64.555930158319]
DiffEdit is a method to take advantage of text-conditioned diffusion models for the task of semantic image editing.
Our main contribution is the ability to automatically generate a mask highlighting the regions of the input image that need to be edited.
arXiv Detail & Related papers (2022-10-20T17:16:37Z)
- Image Shape Manipulation from a Single Augmented Training Sample [26.342929563689218]
DeepSIM is a generative model for conditional image manipulation based on a single image.
Our network learns to map a primitive representation of the image to the image itself.
arXiv Detail & Related papers (2021-09-13T17:44:04Z)
- Text as Neural Operator: Image Manipulation by Text Instruction [68.53181621741632]
In this paper, we study a setting that allows users to edit an image with multiple objects using complex text instructions to add, remove, or change the objects.
The inputs of the task are multimodal, including (1) a reference image and (2) an instruction in natural language that describes the desired modifications to the image.
We show that the proposed model performs favorably against recent strong baselines on three public datasets.
arXiv Detail & Related papers (2020-08-11T07:07:10Z)
- Controllable Image Synthesis via SegVAE [89.04391680233493]
A semantic map is a commonly used intermediate representation for conditional image generation.
In this work, we specifically target generating semantic maps given a label-set consisting of the desired categories.
The proposed framework, SegVAE, synthesizes semantic maps in an iterative manner using a conditional variational autoencoder.
arXiv Detail & Related papers (2020-07-16T15:18:53Z)
- Semantic Image Manipulation Using Scene Graphs [105.03614132953285]
We introduce a spatio-semantic scene graph network that does not require direct supervision for constellation changes or image edits.
This makes it possible to train the system on existing real-world datasets with no additional annotation effort.
arXiv Detail & Related papers (2020-04-07T20:02:49Z)