Making Images from Images: Interleaving Denoising and Transformation
- URL: http://arxiv.org/abs/2411.15925v1
- Date: Sun, 24 Nov 2024 17:13:11 GMT
- Title: Making Images from Images: Interleaving Denoising and Transformation
- Authors: Shumeet Baluja, David Marwood, Ashwin Baluja
- Abstract summary: We learn not only the content of the images, but also the parameterized transformations required to transform the desired images into each other.
By learning the image transforms, we allow any source image to be pre-specified.
Unlike previous methods, increasing the number of regions actually makes the problem easier and improves results.
- Score: 5.78
- Abstract: Simply by rearranging the regions of an image, we can create a new image of any subject matter. The definition of the regions is user-definable, ranging from regularly and irregularly shaped blocks to concentric rings or even individual pixels. Our method extends and improves recent work on the generation of optical illusions by simultaneously learning not only the content of the images but also the parameterized transformations required to transform the desired images into each other. By learning the image transforms, we allow any source image to be pre-specified; any existing image (e.g., the Mona Lisa) can be transformed into a novel subject. We formulate this process as a constrained optimization problem and address it by interleaving the steps of image diffusion with an energy-minimization step. Unlike previous methods, increasing the number of regions actually makes the problem easier and improves results. We demonstrate our approach in both pixel and latent spaces. Creative extensions, such as using infinite copies of the source image and employing multiple source images, are also presented.
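To make the interleaving concrete, the sketch below shows one plausible realization in which the regions are square blocks, the learned transformation is a block permutation, and the energy-minimization step is solved exactly with the Hungarian algorithm. The sampler calls in the closing comments (`denoise_step`, `renoise`) are hypothetical stand-ins for a real diffusion implementation, not the authors' code, and image dimensions are assumed to be multiples of the block size.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def split_blocks(img, b):
    """Split an (H, W, C) array into a list of (b, b, C) blocks, row-major."""
    H, W, _ = img.shape
    return [img[i:i + b, j:j + b] for i in range(0, H, b) for j in range(0, W, b)]

def assemble_blocks(blocks, H, W, b):
    """Inverse of split_blocks."""
    out = np.zeros((H, W, blocks[0].shape[2]), dtype=blocks[0].dtype)
    k = 0
    for i in range(0, H, b):
        for j in range(0, W, b):
            out[i:i + b, j:j + b] = blocks[k]
            k += 1
    return out

def best_rearrangement(source, estimate, b):
    """Energy-minimization step: find the permutation of source blocks that
    best reproduces the current denoised estimate (Hungarian algorithm)."""
    src, est = split_blocks(source, b), split_blocks(estimate, b)
    # Cost of placing source block i at target position j: squared error.
    cost = np.array([[np.sum((s - e) ** 2) for e in est] for s in src])
    rows, cols = linear_sum_assignment(cost)
    placed = [None] * len(est)
    for i, j in zip(rows, cols):
        placed[j] = src[i]  # place source block i at target position j
    H, W, _ = source.shape
    return assemble_blocks(placed, H, W, b)

# Interleaved loop around a hypothetical diffusion sampler:
# for t in reversed(range(T)):
#     x0_hat = sampler.denoise_step(x_t, t, prompt)          # diffusion step
#     x0_hat = best_rearrangement(source_img, x0_hat, b=32)  # constraint step
#     x_t = sampler.renoise(x0_hat, t)                       # back onto the schedule
```

Consistent with the abstract's observation, shrinking the block size (i.e., increasing the number of regions) gives the assignment step more freedom, so the constraint becomes easier to satisfy.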
Related papers
- Move Anything with Layered Scene Diffusion [77.45870343845492]
We propose SceneDiffusion to optimize a layered scene representation during the diffusion sampling process.
Our key insight is that spatial disentanglement can be obtained by jointly denoising scene renderings at different spatial layouts.
Our generated scenes support a wide range of spatial editing operations, including moving, resizing, cloning, and layer-wise appearance editing.
arXiv Detail & Related papers (2024-04-10T17:28:16Z)
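The key insight above might be sketched as follows: render the same layers under several sampled layouts, denoise each rendering, and pull every layer toward its denoised crops averaged across layouts. The `denoiser` callable, the integer-offset renderer, and the update rule are all illustrative assumptions, not the paper's implementation.

```python
import torch

def render(layers, offsets, H, W):
    """Composite (C, h, w) layer tensors onto a canvas at integer offsets."""
    canvas = torch.zeros(layers[0].shape[0], H, W)
    for layer, (dy, dx) in zip(layers, offsets):
        c, h, w = layer.shape
        canvas[:, dy:dy + h, dx:dx + w] = layer  # later layers occlude earlier ones
    return canvas

def joint_denoise_step(layers, layouts, denoiser, t, H, W):
    """Denoise renderings of the same layers under several layouts and
    average the per-layer updates, encouraging layout-invariant content."""
    grads = [torch.zeros_like(l) for l in layers]
    for offsets in layouts:
        x = render(layers, offsets, H, W)
        x0 = denoiser(x, t)  # one reverse-diffusion step on this rendering
        for i, (layer, (dy, dx)) in enumerate(zip(layers, offsets)):
            c, h, w = layer.shape
            grads[i] += x0[:, dy:dy + h, dx:dx + w] - layer  # pull layer toward denoised crop
    return [l + g / len(layouts) for l, g in zip(layers, grads)]
```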
- FlexIT: Towards Flexible Semantic Image Translation [59.09398209706869]
We propose FlexIT, a novel method that takes any input image and a user-defined text instruction for editing.
First, FlexIT combines the input image and text into a single target point in the CLIP multimodal embedding space.
We iteratively transform the input image toward the target point, ensuring coherence and quality with a variety of novel regularization terms.
arXiv Detail & Related papers (2022-03-09T13:34:38Z)
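A minimal sketch of the summarized loop, assuming generic callables for a CLIP image encoder and an image autoencoder; the mixing weight `lam` and the single L2 regularizer are simple stand-ins for the paper's "variety of novel regularization terms".

```python
import torch
import torch.nn.functional as F

def flexit_like_edit(img, text_emb, encode_img, decode, encode_clip,
                     steps=100, lam=0.6):
    """img: (1, C, H, W); text_emb: (1, D) unit-norm CLIP text embedding."""
    with torch.no_grad():
        img_emb = F.normalize(encode_clip(img), dim=-1)
        # Target = a single point between the image and text embeddings.
        target = F.normalize(lam * text_emb + (1 - lam) * img_emb, dim=-1)
        z0 = encode_img(img)  # latent code of the input image
    z = z0.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=0.05)
    for _ in range(steps):
        out = decode(z)
        emb = F.normalize(encode_clip(out), dim=-1)
        loss = 1 - (emb * target).sum()              # cosine distance to the target point
        loss = loss + 0.1 * (z - z0).pow(2).mean()   # stay close to the input (one simple regularizer)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return decode(z).detach()
```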
- Barbershop: GAN-based Image Compositing using Segmentation Masks [40.85660781133709]
We present a novel solution to image blending, particularly for the problem of hairstyle transfer, based on GAN inversion.
Our results demonstrate a significant improvement over the current state of the art in a user study, with users preferring our blending solution over 95 percent of the time.
arXiv Detail & Related papers (2021-06-02T23:20:43Z)
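A toy sketch of mask-guided GAN-inversion blending in the spirit of this summary: optimize a single latent so the generated image matches each reference inside that reference's mask region. The plain L2 loss is a stand-in (real inversion pipelines typically add perceptual terms), and `G` is any differentiable generator.

```python
import torch

def blend_by_inversion(G, refs, masks, w_init, steps=200, lr=0.02):
    """refs/masks: lists of (1, C, H, W) images and (1, 1, H, W) masks."""
    w = w_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        img = G(w)
        # Match each reference only inside its own mask region.
        loss = sum(((img - r) * m).pow(2).mean() for r, m in zip(refs, masks))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return G(w).detach()
```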
- In&Out: Diverse Image Outpainting via GAN Inversion [89.84841983778672]
Image outpainting seeks a semantically consistent extension of the input image beyond its available content.
In this work, we formulate the problem from the perspective of inverting generative adversarial networks.
Our generator renders micro-patches conditioned on their joint latent code as well as their individual positions in the image.
arXiv Detail & Related papers (2021-04-01T17:59:10Z)
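The summary's "micro-patches conditioned on their joint latent code as well as their individual positions" might look roughly like the following position-conditioned patch generator; the architecture and sizes are illustrative, not the paper's.

```python
import torch
import torch.nn as nn

class MicroPatchGenerator(nn.Module):
    """Generate one small patch from a shared latent plus its (x, y) location."""

    def __init__(self, z_dim=128, patch=16, ch=3):
        super().__init__()
        self.patch, self.ch = patch, ch
        self.net = nn.Sequential(
            nn.Linear(z_dim + 2, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, ch * patch * patch), nn.Tanh(),
        )

    def forward(self, z, pos):
        """z: (B, z_dim) shared latent; pos: (B, 2) normalized patch centers."""
        out = self.net(torch.cat([z, pos], dim=-1))
        return out.view(-1, self.ch, self.patch, self.patch)
```

Outpainting then amounts to inverting the latent on the visible image and sampling patch positions beyond its borders while keeping the joint latent fixed.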
- TransFill: Reference-guided Image Inpainting by Merging Multiple Color and Spatial Transformations [35.9576572490994]
We propose TransFill, a multi-homography transformed fusion method to fill the hole by referring to another source image that shares scene contents with the target image.
We learn to adjust the color and apply a pixel-level warping to each homography-warped source image to make it more consistent with the target.
Our method achieves state-of-the-art performance on pairs of images across a variety of wide baselines and color differences, and generalizes to user-provided image pairs.
arXiv Detail & Related papers (2021-03-29T22:45:07Z)
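A simplified sketch of the warp-adjust-blend pipeline using OpenCV, with a least-squares per-channel color fit standing in for TransFill's learned color and pixel-level spatial adjustments; a single given homography `H` replaces the paper's multi-homography proposals and fusion.

```python
import cv2
import numpy as np

def transfill_like(target, source, hole_mask, H):
    """target/source: (H, W, 3) uint8; hole_mask: (H, W) bool; H: 3x3 homography."""
    h, w = target.shape[:2]
    warped = cv2.warpPerspective(source, H, (w, h)).astype(np.float64)
    tgt = target.astype(np.float64)
    known = ~hole_mask
    # Per-channel affine color transfer, fit on the known (non-hole) region.
    for c in range(3):
        gain, bias = np.polyfit(warped[known, c], tgt[known, c], 1)
        warped[..., c] = np.clip(gain * warped[..., c] + bias, 0, 255)
    # Fill the hole with the color-matched, warped source content.
    out = tgt.copy()
    out[hole_mask] = warped[hole_mask]
    return out.astype(np.uint8)
```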
- Unsupervised Image Transformation Learning via Generative Adversarial Networks [40.84518581293321]
We study the image transformation problem by learning the underlying transformations from a collection of images using Generative Adversarial Networks (GANs).
We propose an unsupervised learning framework, termed as TrGAN, to project images onto a transformation space that is shared by the generator and the discriminator.
arXiv Detail & Related papers (2021-03-13T17:08:19Z)
- Powers of layers for image-to-image translation [60.5529622990682]
We propose a simple architecture to address unpaired image-to-image translation tasks.
We start from an image autoencoder architecture with fixed weights.
For each task we learn a residual block operating in the latent space, which is iteratively called until the target domain is reached.
arXiv Detail & Related papers (2020-08-13T09:02:17Z)
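The recipe in this summary is simple enough to sketch directly: a frozen autoencoder plus one learned residual block iterated in latent space. Module shapes and the iteration count below are illustrative.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """The single learned block that is applied repeatedly in latent space."""

    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1),
        )

    def forward(self, z):
        return z + self.body(z)

def translate(encoder, decoder, block, x, n_iters=4):
    """Frozen encoder/decoder; iterating the block more times pushes the
    latent further toward the target domain."""
    with torch.no_grad():
        z = encoder(x)
    for _ in range(n_iters):
        z = block(z)
    return decoder(z)
```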
- Semantic Image Manipulation Using Scene Graphs [105.03614132953285]
We introduce a semantic scene graph network that does not require direct supervision for constellation changes or image edits.
This makes it possible to train the system on existing real-world datasets with no additional annotation effort.
arXiv Detail & Related papers (2020-04-07T20:02:49Z)
- Exploiting Deep Generative Prior for Versatile Image Restoration and Manipulation [181.08127307338654]
This work presents an effective way to exploit the image prior captured by a generative adversarial network (GAN) trained on large-scale natural images.
The deep generative prior (DGP) provides compelling results when restoring missing semantics (e.g., color, patches, resolution) of various degraded images.
arXiv Detail & Related papers (2020-03-30T17:45:07Z)
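A bare-bones sketch of restoring with a generative prior: optimize the latent so that a known differentiable degradation of `G(z)` matches the observation. The actual DGP method is more elaborate (it also progressively relaxes the generator weights), so this shows only the core idea; `degrade` might be graying, masking, or downsampling depending on the task.

```python
import torch

def dgp_restore(G, degrade, y, z_init, steps=500, lr=0.01):
    """G: pretrained generator; degrade: differentiable degradation model;
    y: observed degraded image; z_init: starting latent code."""
    z = z_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        # Match the degraded generation to the observation; the GAN prior
        # constrains the full image G(z) to stay on the natural-image manifold.
        loss = (degrade(G(z)) - y).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return G(z).detach()
```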