Factorized Diffusion: Perceptual Illusions by Noise Decomposition
- URL: http://arxiv.org/abs/2404.11615v1
- Date: Wed, 17 Apr 2024 17:59:59 GMT
- Title: Factorized Diffusion: Perceptual Illusions by Noise Decomposition
- Authors: Daniel Geng, Inbum Park, Andrew Owens
- Abstract summary: Given a factorization of an image into a sum of linear components, we present a zero-shot method to control each individual component through diffusion model sampling.
For certain decompositions, our method recovers prior approaches to compositional generation and spatial control.
We show that we can extend our approach to generate hybrid images from real images.
- Score: 15.977340635967018
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given a factorization of an image into a sum of linear components, we present a zero-shot method to control each individual component through diffusion model sampling. For example, we can decompose an image into low and high spatial frequencies and condition these components on different text prompts. This produces hybrid images, which change appearance depending on viewing distance. By decomposing an image into three frequency subbands, we can generate hybrid images with three prompts. We also use a decomposition into grayscale and color components to produce images whose appearance changes when they are viewed in grayscale, a phenomenon that naturally occurs under dim lighting. And we explore a decomposition by a motion blur kernel, which produces images that change appearance under motion blurring. Our method works by denoising with a composite noise estimate, built from the components of noise estimates conditioned on different prompts. We also show that for certain decompositions, our method recovers prior approaches to compositional generation and spatial control. Finally, we show that we can extend our approach to generate hybrid images from real images. We do this by holding one component fixed and generating the remaining components, effectively solving an inverse problem.
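The composite noise estimate described in the abstract can be sketched in a few lines for the low/high spatial frequency factorization. The sketch below is an illustrative assumption, not the authors' implementation: it uses a Gaussian blur as the low-pass operator and assumes the two noise estimates (one per prompt) are given as NumPy arrays; in a real pipeline they would come from a text-conditioned diffusion model at each denoising step.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def lowpass(x, sigma=3.0):
    """Low-frequency component via Gaussian blur, applied per channel."""
    return gaussian_filter(x, sigma=(sigma, sigma, 0))

def composite_noise(eps_low_prompt, eps_high_prompt, sigma=3.0):
    """Composite noise estimate for a low/high frequency factorization.

    eps_low_prompt:  noise estimate conditioned on the prompt that should
                     control the low frequencies (seen from far away)
    eps_high_prompt: noise estimate conditioned on the prompt that should
                     control the high frequencies (seen up close)
    """
    low = lowpass(eps_low_prompt, sigma)                     # keep low band of estimate A
    high = eps_high_prompt - lowpass(eps_high_prompt, sigma)  # keep high band of estimate B
    return low + high  # the two bands sum to a full noise estimate

# Toy example: two stand-in "noise estimates" for a 64x64 RGB latent.
rng = np.random.default_rng(0)
eps_a = rng.standard_normal((64, 64, 3))
eps_b = rng.standard_normal((64, 64, 3))
eps = composite_noise(eps_a, eps_b)

# Sanity check: with identical inputs the low/high split is exact,
# so the composite reduces to the original estimate.
assert np.allclose(composite_noise(eps_a, eps_a), eps_a)
```

Because the decomposition is linear and the two bands are complementary, conditioning each band on a different prompt steers the corresponding frequency content of the final image, which is what produces the viewing-distance-dependent hybrid effect.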
Related papers
- Compositional Image Decomposition with Diffusion Models [70.07406583580591]
In this paper, we present a method to decompose an image into such compositional components.
Our approach, Decomp Diffusion, is an unsupervised method which infers a set of different components in the image.
We demonstrate how components can capture different factors of the scene, ranging from global scene descriptors like shadows or facial expression to local scene descriptors like constituent objects.
arXiv Detail & Related papers (2024-06-27T16:13:34Z)
- AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation [99.57024606542416]
We propose an adaptive all-in-one image restoration network based on frequency mining and modulation.
Our approach is motivated by the observation that different degradation types impact the image content on different frequency subbands.
The proposed model achieves adaptive reconstruction by accentuating the informative frequency subbands according to different input degradations.
arXiv Detail & Related papers (2024-03-21T17:58:14Z)
- Neural Spline Fields for Burst Image Fusion and Layer Separation [40.9442467471977]
We propose a versatile intermediate representation: a two-layer alpha-composited image plus flow model constructed with neural spline fields.
Our method is able to jointly fuse a burst image capture into one high-resolution reconstruction and decompose it into transmission and obstruction layers.
We find that, with no post-processing steps or learned priors, our generalizable model is able to outperform existing dedicated single-image and multi-view obstruction removal approaches.
arXiv Detail & Related papers (2023-12-21T18:54:19Z)
- Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models [15.977340635967018]
Multi-view optical illusions are images that change appearance upon a transformation, such as a flip or rotation.
We propose a zero-shot method for obtaining these illusions from off-the-shelf text-to-image diffusion models.
We provide both qualitative and quantitative results demonstrating the effectiveness and flexibility of our method.
arXiv Detail & Related papers (2023-11-29T18:59:59Z)
- Decomposer: Semi-supervised Learning of Image Restoration and Image Decomposition [2.702990676892003]
We present a semi-supervised reconstruction model that decomposes distorted image sequences into their fundamental building blocks.
We use the SIDAR dataset that provides a large number of distorted image sequences.
Each distortion changes the original signal in different ways, e.g., additive or multiplicative noise.
arXiv Detail & Related papers (2023-11-28T14:48:22Z)
- Diffusion Posterior Illumination for Ambiguity-aware Inverse Rendering [63.24476194987721]
Inverse rendering, the process of inferring scene properties from images, is a challenging inverse problem.
Most existing solutions incorporate priors into the inverse-rendering pipeline to encourage plausible solutions.
We propose a novel scheme that integrates a denoising probabilistic diffusion model pre-trained on natural illumination maps into an optimization framework.
arXiv Detail & Related papers (2023-09-30T12:39:28Z)
- $PC^2$: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction [97.06927852165464]
Reconstructing the 3D shape of an object from a single RGB image is a long-standing and highly challenging problem in computer vision.
We propose a novel method for single-image 3D reconstruction which generates a sparse point cloud via a conditional denoising diffusion process.
arXiv Detail & Related papers (2023-02-21T13:37:07Z)
- Blind Image Decomposition [53.760745569495825]
We present Blind Image Decomposition (BID), which requires separating a superimposed image into constituent underlying images in a blind setting.
Decomposing superimposed images, such as rainy images, into distinct source components is a crucial step toward real-world vision systems.
We propose a simple yet general Blind Image Decomposition Network (BIDeN) to serve as a strong baseline for future work.
arXiv Detail & Related papers (2021-08-25T17:37:19Z)
- Ensembling with Deep Generative Views [72.70801582346344]
Generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose.
Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification.
We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars.
arXiv Detail & Related papers (2021-04-29T17:58:35Z)
- A Deep Decomposition Network for Image Processing: A Case Study for Visible and Infrared Image Fusion [38.17268441062239]
We propose a new image decomposition method based on a convolutional neural network.
We take an infrared image and a visible-light image as input and decompose each into three high-frequency feature images and one low-frequency feature image.
The two sets of feature images are fused using a specific fusion strategy to obtain fusion feature images.
arXiv Detail & Related papers (2021-02-21T06:34:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.