I$^2$SB: Image-to-Image Schr\"odinger Bridge
- URL: http://arxiv.org/abs/2302.05872v3
- Date: Fri, 26 May 2023 02:55:08 GMT
- Title: I$^2$SB: Image-to-Image Schr\"odinger Bridge
- Authors: Guan-Horng Liu, Arash Vahdat, De-An Huang, Evangelos A. Theodorou,
Weili Nie, Anima Anandkumar
- Abstract summary: Image-to-Image Schr\"odinger Bridge (I$^2$SB) is a new class of conditional diffusion models.
I$^2$SB directly learns the nonlinear diffusion processes between two given distributions.
We show that I$^2$SB surpasses standard conditional diffusion models with more interpretable generative processes.
- Score: 87.43524087956457
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose Image-to-Image Schr\"odinger Bridge (I$^2$SB), a new class of
conditional diffusion models that directly learn the nonlinear diffusion
processes between two given distributions. These diffusion bridges are
particularly useful for image restoration, as the degraded images are
structurally informative priors for reconstructing the clean images. I$^2$SB
belongs to a tractable class of Schr\"odinger bridge, the nonlinear extension
to score-based models, whose marginal distributions can be computed
analytically given boundary pairs. This results in a simulation-free framework
for nonlinear diffusions, where the I$^2$SB training becomes scalable by
adopting practical techniques used in standard diffusion models. We validate
I$^2$SB in solving various image restoration tasks, including inpainting,
super-resolution, deblurring, and JPEG restoration on ImageNet 256x256 and show
that I$^2$SB surpasses standard conditional diffusion models with more
interpretable generative processes. Moreover, I$^2$SB matches the performance
of inverse methods that additionally require the knowledge of the corruption
operators. Our work opens up new algorithmic opportunities for developing
efficient nonlinear diffusion models on a large scale. Project page and
codes: https://i2sb.github.io/
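The abstract's claim that marginals "can be computed analytically given boundary pairs" can be illustrated with a small sketch. It assumes the Gaussian bridge form in which, given a clean image x0 and a degraded image x1, the intermediate state X_t is normal with a mean that interpolates between the pair and a variance that vanishes at both ends. The constant noise rate `beta`, the function name `sample_bridge`, and the toy arrays are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def sample_bridge(x0, x1, t, rng, beta=1.0):
    """Draw X_t given a boundary pair (x0, x1), for t in [0, 1].

    Assumption: a constant noise rate `beta`, so the variance accumulated
    from time 0 to t is beta * t and from t to 1 is beta * (1 - t).
    """
    sigma2_t = beta * t              # variance accumulated on [0, t]
    sigma2_bar_t = beta * (1.0 - t)  # variance accumulated on [t, 1]
    denom = sigma2_t + sigma2_bar_t

    # The mean interpolates between the two boundary images.
    mu_t = (sigma2_bar_t / denom) * x0 + (sigma2_t / denom) * x1
    # The variance is zero at t = 0 and t = 1, so both endpoints are pinned.
    var_t = sigma2_t * sigma2_bar_t / denom

    return mu_t + np.sqrt(var_t) * rng.standard_normal(x0.shape)


rng = np.random.default_rng(0)
clean = np.zeros((4, 4))     # stand-in for a clean image
degraded = np.ones((4, 4))   # stand-in for its degraded counterpart
x_mid = sample_bridge(clean, degraded, 0.5, rng)
```

Because X_t is drawn in closed form for any t, a training loop built on such a sampler would not need to simulate the diffusion step by step, which is the sense in which the abstract calls the framework simulation-free.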
Related papers
- Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow [65.51671121528858]
Diffusion models have greatly improved visual generation but are hindered by slow generation speed due to the computationally intensive nature of solving generative ODEs.
Rectified flow, a widely recognized solution, improves generation speed by straightening the ODE path.
We propose Rectified Diffusion, which generalizes the design space and application scope of rectification to encompass the broader category of diffusion models.
arXiv Detail & Related papers (2024-10-09T17:43:38Z)
- A Sharp Convergence Theory for The Probability Flow ODEs of Diffusion Models [45.60426164657739]
We develop non-asymptotic convergence theory for a diffusion-based sampler.
We prove that $d/\varepsilon$ iterations are sufficient to approximate the target distribution to within $\varepsilon$ total-variation distance.
Our results also characterize how $\ell_2$ score estimation errors affect the quality of the data generation processes.
arXiv Detail & Related papers (2024-08-05T09:02:24Z)
- FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models [56.71672127740099]
We focus on the task of image segmentation, which is traditionally solved by training models on closed-vocabulary datasets.
We leverage different and relatively small-sized, open-source foundation models for zero-shot open-vocabulary segmentation.
Our approach (dubbed FreeSeg-Diff), which does not rely on any training, outperforms many training-based approaches on both Pascal VOC and COCO datasets.
arXiv Detail & Related papers (2024-03-29T10:38:25Z)
- Implicit Image-to-Image Schrodinger Bridge for Image Restoration [13.138398298354113]
The Image-to-Image Schr\"odinger Bridge (I$^2$SB) presents a promising alternative by starting the generative process from corrupted images.
We introduce the Implicit Image-to-Image Schr\"odinger Bridge (I$^3$SB) to further accelerate the generative process of I$^2$SB.
arXiv Detail & Related papers (2024-03-10T03:22:57Z)
- Cross-view Masked Diffusion Transformers for Person Image Synthesis [21.242398582282522]
We present X-MDPT, a novel diffusion model designed for pose-guided human image generation.
X-MDPT distinguishes itself by employing masked diffusion transformers that operate on latent patches.
Our model outperforms state-of-the-art approaches on the DeepFashion dataset.
arXiv Detail & Related papers (2024-02-02T15:57:13Z)
- Denoising Diffusion Bridge Models [54.87947768074036]
Diffusion models are powerful generative models that map noise to data using stochastic processes.
For many applications such as image editing, the model input comes from a distribution that is not random noise.
In our work, we propose Denoising Diffusion Bridge Models (DDBMs).
arXiv Detail & Related papers (2023-09-29T03:24:24Z)
- Unpaired Image-to-Image Translation via Neural Schr\"odinger Bridge [70.79973551604539]
We propose Unpaired Neural Schr\"odinger Bridge (UNSB), which expresses the SB problem as a sequence of adversarial learning problems.
UNSB is scalable and successfully solves various unpaired I2I translation tasks.
arXiv Detail & Related papers (2023-05-24T12:05:24Z)
- SDM: Spatial Diffusion Model for Large Hole Image Inpainting [106.90795513361498]
We present a novel spatial diffusion model (SDM) that uses a few iterations to gradually deliver informative pixels to the entire image.
Also, thanks to the proposed decoupled probabilistic modeling and spatial diffusion scheme, our method achieves high-quality large-hole completion.
arXiv Detail & Related papers (2022-12-06T13:30:18Z)