Bidirectional Consistency Models
- URL: http://arxiv.org/abs/2403.18035v3
- Date: Mon, 30 Sep 2024 11:18:38 GMT
- Title: Bidirectional Consistency Models
- Authors: Liangchen Li, Jiajun He,
- Abstract summary: Diffusion models (DMs) generate high-quality samples by iteratively denoising a random vector.
DMs can invert an input image to noise by moving backward along the probability flow ordinary differential equation (PF ODE)
- Score: 1.486435467709869
- License:
- Abstract: Diffusion models (DMs) are capable of generating remarkably high-quality samples by iteratively denoising a random vector, a process that corresponds to moving along the probability flow ordinary differential equation (PF ODE). Interestingly, DMs can also invert an input image to noise by moving backward along the PF ODE, a key operation for downstream tasks such as interpolation and image editing. However, the iterative nature of this process restricts its speed, hindering its broader application. Recently, Consistency Models (CMs) have emerged to address this challenge by approximating the integral of the PF ODE, largely reducing the number of iterations. Yet, the absence of an explicit ODE solver complicates the inversion process. To resolve this, we introduce Bidirectional Consistency Model (BCM), which learns a single neural network that enables both forward and backward traversal along the PF ODE, efficiently unifying generation and inversion tasks within one framework. We can train BCM from scratch or tune it using a pretrained consistency model, wh ich reduces the training cost and increases scalability. We demonstrate that BCM enables one-step generation and inversion while also allowing the use of additional steps to enhance generation quality or reduce reconstruction error. We further showcase BCM's capability in downstream tasks, such as interpolation, inpainting, and blind restoration of compressed images. Notably, when the number of function evaluations (NFE) is constrained, BCM surpasses domain-specific restoration methods, such as I$^2$SB and Palette, in a fully zero-shot manner, offering an efficient alternative for inversion problems. Our code and weights are available at https://github.com/Mosasaur5526/BCM-iCT-torch.
Related papers
- Latent Schrodinger Bridge: Prompting Latent Diffusion for Fast Unpaired Image-to-Image Translation [58.19676004192321]
Diffusion models (DMs), which enable both image generation from noise and inversion from data, have inspired powerful unpaired image-to-image (I2I) translation algorithms.
We tackle this problem with Schrodinger Bridges (SBs), which are differential equations (SDEs) between distributions with minimal transport cost.
Inspired by this observation, we propose Latent Schrodinger Bridges (LSBs) that approximate the SB ODE via pre-trained Stable Diffusion.
We demonstrate that our algorithm successfully conduct competitive I2I translation in unsupervised setting with only a fraction of cost required by previous DM-
arXiv Detail & Related papers (2024-11-22T11:24:14Z) - MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling [64.09238330331195]
We propose a novel Multi-Modal Auto-Regressive (MMAR) probabilistic modeling framework.
Unlike discretization line of method, MMAR takes in continuous-valued image tokens to avoid information loss.
We show that MMAR demonstrates much more superior performance than other joint multi-modal models.
arXiv Detail & Related papers (2024-10-14T17:57:18Z) - On Exact Bit-level Reversible Transformers Without Changing Architectures [4.282029766809805]
reversible deep neural networks (DNNs) have been proposed to reduce memory consumption in the training process.
We present the BDIA-transformer, which is an exact bit-level reversible transformer that uses an unchanged standard architecture for inference.
arXiv Detail & Related papers (2024-07-12T08:42:58Z) - Binarized Diffusion Model for Image Super-Resolution [61.963833405167875]
Binarization, an ultra-compression algorithm, offers the potential for effectively accelerating advanced diffusion models (DMs)
Existing binarization methods result in significant performance degradation.
We introduce a novel binarized diffusion model, BI-DiffSR, for image SR.
arXiv Detail & Related papers (2024-06-09T10:30:25Z) - Look-Around Before You Leap: High-Frequency Injected Transformer for Image Restoration [46.96362010335177]
In this paper, we propose HIT, a simple yet effective High-frequency Injected Transformer for image restoration.
Specifically, we design a window-wise injection module (WIM), which incorporates abundant high-frequency details into the feature map, to provide reliable references for restoring high-quality images.
In addition, we introduce a spatial enhancement unit (SEU) to preserve essential spatial relationships that may be lost due to the computations carried out across channel dimensions in the BIM.
arXiv Detail & Related papers (2024-03-30T08:05:00Z) - MsDC-DEQ-Net: Deep Equilibrium Model (DEQ) with Multi-scale Dilated
Convolution for Image Compressive Sensing (CS) [0.0]
Compressive sensing (CS) is a technique that enables the recovery of sparse signals using fewer measurements than traditional sampling methods.
We develop an interpretable and concise neural network model for reconstructing natural images using CS.
The model, called MsDC-DEQ-Net, exhibits competitive performance compared to state-of-the-art network-based methods.
arXiv Detail & Related papers (2024-01-05T16:25:58Z) - Deep Equilibrium Diffusion Restoration with Parallel Sampling [120.15039525209106]
Diffusion model-based image restoration (IR) aims to use diffusion models to recover high-quality (HQ) images from degraded images, achieving promising performance.
Most existing methods need long serial sampling chains to restore HQ images step-by-step, resulting in expensive sampling time and high computation costs.
In this work, we aim to rethink the diffusion model-based IR models through a different perspective, i.e., a deep equilibrium (DEQ) fixed point system, called DeqIR.
arXiv Detail & Related papers (2023-11-20T08:27:56Z) - Effective Invertible Arbitrary Image Rescaling [77.46732646918936]
Invertible Neural Networks (INN) are able to increase upscaling accuracy significantly by optimizing the downscaling and upscaling cycle jointly.
A simple and effective invertible arbitrary rescaling network (IARN) is proposed to achieve arbitrary image rescaling by training only one model in this work.
It is shown to achieve a state-of-the-art (SOTA) performance in bidirectional arbitrary rescaling without compromising perceptual quality in LR outputs.
arXiv Detail & Related papers (2022-09-26T22:22:30Z) - A memory-efficient neural ODE framework based on high-level adjoint
differentiation [4.063868707697316]
We present a new neural ODE framework, PNODE, based on high-level discrete algorithmic differentiation.
We show that PNODE achieves the highest memory efficiency when compared with other reverse-accurate methods.
arXiv Detail & Related papers (2022-06-02T20:46:26Z) - Denoising Diffusion Restoration Models [110.1244240726802]
Denoising Diffusion Restoration Models (DDRM) is an efficient, unsupervised posterior sampling method.
We demonstrate DDRM's versatility on several image datasets for super-resolution, deblurring, inpainting, and colorization.
arXiv Detail & Related papers (2022-01-27T20:19:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.