Related papers: Implicit Image-to-Image Schrodinger Bridge for Image Restoration

URL: http://arxiv.org/abs/2403.06069v3
Date: Sat, 22 Mar 2025 03:07:11 GMT
Title: Implicit Image-to-Image Schrodinger Bridge for Image Restoration
Authors: Yuang Wang, Siyeop Yoon, Pengfei Jin, Matthew Tivnan, Sifan Song, Zhennong Chen, Rui Hu, Li Zhang, Quanzheng Li, Zhiqiang Chen, Dufan Wu
Abstract summary: We introduce the Implicit Image-to-Image Schr\"odinger Bridge (I$^3$SB) to further accelerate the generative process of I$^2$SB. I$^3$SB restructures the generative process into a non-Markovian framework by incorporating the initial corrupted image at each generative step. Compared to I$^2$SB, I$^3$SB achieves the same perceptual quality with fewer generative steps, while maintaining or improving fidelity to the ground truth.
Score: 13.138398298354113
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Diffusion-based models have demonstrated remarkable effectiveness in image restoration tasks; however, their iterative denoising process, which starts from Gaussian noise, often leads to slow inference speeds. The Image-to-Image Schr\"odinger Bridge (I$^2$SB) offers a promising alternative by initializing the generative process from corrupted images while leveraging training techniques from score-based diffusion models. In this paper, we introduce the Implicit Image-to-Image Schr\"odinger Bridge (I$^3$SB) to further accelerate the generative process of I$^2$SB. I$^3$SB restructures the generative process into a non-Markovian framework by incorporating the initial corrupted image at each generative step, effectively preserving and utilizing its information. To enable direct use of pretrained I$^2$SB models without additional training, we ensure consistency in marginal distributions. Extensive experiments across many image corruptions, including noise, low resolution, JPEG compression, and sparse sampling, and multiple image modalities, such as natural, human face, and medical images, demonstrate the acceleration benefits of I$^3$SB. Compared to I$^2$SB, I$^3$SB achieves the same perceptual quality with fewer generative steps, while maintaining or improving fidelity to the ground truth.
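The abstract's two central ideas, starting generation from the corrupted image and keeping marginal distributions consistent with I$^2$SB, both rest on the distribution of an intermediate state given a clean image and its corrupted counterpart. The following is a minimal sketch, assuming the Gaussian bridge marginal usually associated with I$^2$SB (mean as a variance-weighted blend of the two endpoints). The function names and the `var_fwd`/`var_bwd` arguments are hypothetical, and this is not the I$^3$SB sampler itself, whose update rule is not given in the text above.

```python
import math
import random


def bridge_marginal(x0, x1, var_fwd, var_bwd):
    """Mean and std of an assumed Gaussian bridge marginal q(x_t | x0, x1).

    x0: clean image as a flat list of floats
    x1: corrupted image as a flat list of floats
    var_fwd: variance accumulated from t=0 up to t (hypothetical name)
    var_bwd: variance accumulated from t up to t=1 (hypothetical name)
    """
    total = var_fwd + var_bwd
    if total <= 0:
        raise ValueError("var_fwd + var_bwd must be positive")
    w0 = var_bwd / total  # weight on the clean image
    w1 = var_fwd / total  # weight on the corrupted image
    mean = [w0 * a + w1 * b for a, b in zip(x0, x1)]
    std = math.sqrt(var_fwd * var_bwd / total)
    return mean, std


def sample_bridge(x0, x1, var_fwd, var_bwd, rng=random):
    """Draw one intermediate state from the assumed marginal."""
    mean, std = bridge_marginal(x0, x1, var_fwd, var_bwd)
    return [m + std * rng.gauss(0.0, 1.0) for m in mean]
```

Under this assumption the state coincides with the clean image when `var_fwd` is zero and with the corrupted image when `var_bwd` is zero, which is why generation can begin at the corrupted image rather than at Gaussian noise.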

Related papers

Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think [56.539823627694304]
REPA and its variants effectively mitigate training challenges in diffusion models by incorporating external visual representations from pretrained models. We argue that the external alignment, which is absent during the entire denoising inference process, falls short of fully harnessing the potential of discriminative representations. We propose Representation Entanglement for Generation (REG), which entangles low-level image latents with a single high-level class token from pretrained foundation models for denoising.
arXiv Detail & Related papers (2025-07-02T08:29:18Z)
IRBridge: Solving Image Restoration Bridge with Pre-trained Generative Diffusion Models [43.84154970740943]
Bridge models in image restoration construct a diffusion process from degraded to clear images. Existing methods typically require training a bridge model from scratch for each specific type of degradation. We introduce the IRBridge framework, which enables the direct utilization of generative models within image restoration bridges.
arXiv Detail & Related papers (2025-05-30T09:45:41Z)
Towards Better Alignment: Training Diffusion Models with Reinforcement Learning Against Sparse Rewards [52.90573877727541]
Reinforcement learning (RL) has been considered for diffusion model fine-tuning. RL's effectiveness is limited by the challenge of sparse reward. $\text{B}^2\text{-DiffuRL}$ is compatible with existing optimization algorithms.
arXiv Detail & Related papers (2025-03-14T09:45:19Z)
An Ordinary Differential Equation Sampler with Stochastic Start for Diffusion Bridge Models [13.00429687431982]
Diffusion bridge models initialize the generative process from corrupted images instead of pure Gaussian noise. Existing diffusion bridge models often rely on stochastic differential equation (SDE) samplers, which result in slower inference speed. We propose a high-order ODE sampler with a stochastic start for diffusion bridge models. Our method is fully compatible with pretrained diffusion bridge models and requires no additional training.
arXiv Detail & Related papers (2024-12-28T03:32:26Z)
Fast constrained sampling in pre-trained diffusion models [77.21486516041391]
Diffusion models have dominated the field of large, generative image models. We propose an algorithm for fast constrained sampling in large pre-trained diffusion models.
arXiv Detail & Related papers (2024-10-24T14:52:38Z)
Blind Image Restoration via Fast Diffusion Inversion [17.139433082780037]
Blind Image Restoration via fast Diffusion (BIRD) is a blind IR method that jointly optimizes for the degradation model parameters and the restored image. A key idea in our method is not to modify the reverse sampling, i.e., not to alter all the intermediate latents, once an initial noise is sampled. We experimentally validate BIRD on several image restoration tasks and show that it achieves state-of-the-art performance on all of them.
arXiv Detail & Related papers (2024-05-29T23:38:12Z)
SwiftBrush: One-Step Text-to-Image Diffusion Model with Variational Score Distillation [1.5892730797514436]
Text-to-image diffusion models often suffer from slow iterative sampling processes. We present a novel image-free distillation scheme named SwiftBrush. SwiftBrush achieves an FID score of 16.67 and a CLIP score of 0.29 on the COCO-30K benchmark.
arXiv Detail & Related papers (2023-12-08T18:44:09Z)
Beyond First-Order Tweedie: Solving Inverse Problems using Latent Diffusion [41.758635460235716]
We introduce Second-order Tweedie sampler from Surrogate Loss (STSL). STSL offers efficiency comparable to first-order Tweedie with a tractable reverse process using second-order approximation. Our method surpasses SoTA solvers PSLD and P2L, achieving 4X and 8X reduction in neural function evaluations.
arXiv Detail & Related papers (2023-12-01T14:36:24Z)
ACT-Diffusion: Efficient Adversarial Consistency Training for One-step Diffusion Models [59.90959789767886]
We show that optimizing consistency training loss minimizes the Wasserstein distance between target and generated distributions. By incorporating a discriminator into the consistency training framework, our method achieves improved FID scores on the CIFAR10, ImageNet 64$\times$64, and LSUN Cat 256$\times$256 datasets.
arXiv Detail & Related papers (2023-11-23T16:49:06Z)
Improving Denoising Diffusion Models via Simultaneous Estimation of Image and Noise [15.702941058218196]
This paper introduces two key contributions aimed at improving the speed and quality of images generated through inverse diffusion processes. The first contribution involves reparameterizing the diffusion process in terms of the angle on a quarter-circular arc between the image and noise. The second contribution is to directly estimate both the image ($\mathbf{x}_0$) and noise ($\mathbf{\epsilon}$) using our network.
arXiv Detail & Related papers (2023-10-26T05:43:07Z)
Simultaneous Image-to-Zero and Zero-to-Noise: Diffusion Models with Analytical Image Attenuation [53.04220377034574]
We propose incorporating an analytical image attenuation process into the forward diffusion process for high-quality (un)conditioned image generation. Our method represents the forward image-to-noise mapping as simultaneous image-to-zero mapping and zero-to-noise mapping. We have conducted experiments on unconditioned image generation, e.g., CIFAR-10 and CelebA-HQ-256, and image-conditioned downstream tasks such as super-resolution, saliency detection, edge detection, and image inpainting.
arXiv Detail & Related papers (2023-06-23T18:08:00Z)
Image generation with shortest path diffusion [10.041144269046693]
We show that the Shortest Path Diffusion (SPD) determines the entire structure of the corruption. We show that SPD improves on strong baselines without any hyperparameter tuning and outperforms all previous Diffusion Models based on image blurring. Our work sheds new light on observations made in recent works and provides a new approach to improve diffusion models on images and other types of data.
arXiv Detail & Related papers (2023-06-01T09:53:35Z)
EGC: Image Generation and Classification via a Diffusion Energy-Based Model [59.591755258395594]
This work introduces an energy-based classifier and generator, namely EGC, which can achieve superior performance in both tasks using a single neural network. EGC achieves competitive generation results compared with state-of-the-art approaches on ImageNet-1k, CelebA-HQ and LSUN Church. This work represents the first successful attempt to simultaneously excel in both tasks using a single set of network parameters.
arXiv Detail & Related papers (2023-04-04T17:59:14Z)
I$^2$SB: Image-to-Image Schr\"odinger Bridge [87.43524087956457]
Image-to-Image Schr\"odinger Bridge (I$^2$SB) is a new class of conditional diffusion models. I$^2$SB directly learns the nonlinear diffusion processes between two given distributions. We show that I$^2$SB surpasses standard conditional diffusion models with more interpretable generative processes.
arXiv Detail & Related papers (2023-02-12T08:35:39Z)
Dynamic Dual-Output Diffusion Models [100.32273175423146]
Iterative denoising-based generation has been shown to be comparable in quality to other classes of generative models. A major drawback of this method is that it requires hundreds of iterations to produce a competitive result. Recent works have proposed solutions that allow for faster generation with fewer iterations, but the image quality gradually deteriorates.
arXiv Detail & Related papers (2022-03-08T11:20:40Z)
Unsupervised Single Image Super-resolution Under Complex Noise [60.566471567837574]
This paper proposes a model-based unsupervised SISR method to deal with the general SISR task with unknown degradations. The proposed method clearly surpasses the current state-of-the-art (SotA) method (by about 1 dB PSNR), not only with a smaller model (0.34M vs. 2.40M) but also at a faster speed.
arXiv Detail & Related papers (2021-07-02T11:55:40Z)
Deep Variational Network Toward Blind Image Restoration [60.45350399661175]
Blind image restoration is a common yet challenging problem in computer vision. We propose a novel blind image restoration method, aiming to integrate the advantages of both model-based and deep-learning-based approaches. Experiments on two typical blind IR tasks, namely image denoising and super-resolution, demonstrate that the proposed method achieves superior performance over current state-of-the-art methods.
arXiv Detail & Related papers (2020-08-25T03:30:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.