Related papers: Frame Interpolation with Consecutive Brownian Bridge Diffusion

URL: http://arxiv.org/abs/2405.05953v6
Date: Mon, 18 Nov 2024 08:53:41 GMT
Title: Frame Interpolation with Consecutive Brownian Bridge Diffusion
Authors: Zonglin Lyu, Ming Li, Jianbo Jiao, Chen Chen
Abstract summary: Recent work in Video Frame Interpolation (VFI) tries to formulate VFI as a diffusion-based conditional image generation problem. We propose our unique solution: Frame Interpolation with Consecutive Brownian Bridge Diffusion.
Score: 21.17973023413981
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent work in Video Frame Interpolation (VFI) tries to formulate VFI as a diffusion-based conditional image generation problem, synthesizing the intermediate frame given a random noise and neighboring frames. Due to the relatively high resolution of videos, Latent Diffusion Models (LDMs) are employed as the conditional generation model, where the autoencoder compresses images into latent representations for diffusion and then reconstructs images from these latent representations. Such a formulation poses a crucial challenge: VFI expects that the output is deterministically equal to the ground truth intermediate frame, but LDMs randomly generate a diverse set of different images when the model runs multiple times. The reason for the diverse generation is that the cumulative variance (variance accumulated at each step of generation) of generated latent representations in LDMs is large. This makes the sampling trajectory random, resulting in diverse rather than deterministic generations. To address this problem, we propose our unique solution: Frame Interpolation with Consecutive Brownian Bridge Diffusion. Specifically, we propose consecutive Brownian Bridge diffusion that takes a deterministic initial value as input, resulting in a much smaller cumulative variance of generated latent representations. Our experiments suggest that our method can improve together with the improvement of the autoencoder and achieve state-of-the-art performance in VFI, leaving strong potential for further enhancement.

Related papers

DeltaDiff: Reality-Driven Diffusion with AnchorResiduals for Faithful SR [10.790771977682763]
We propose DeltaDiff, a novel framework that constrains the diffusion process. Our method surpasses state-of-the-art models and generates results with better fidelity. This work establishes a new low-rank constrained paradigm for applying diffusion models to image reconstruction tasks.
arXiv Detail & Related papers (2025-02-18T06:07:14Z)
An Ordinary Differential Equation Sampler with Stochastic Start for Diffusion Bridge Models [13.00429687431982]
Diffusion bridge models initialize the generative process from corrupted images instead of pure Gaussian noise. Existing diffusion bridge models often rely on Stochastic Differential Equation (SDE) samplers, which result in slower inference speed. We propose a high-order ODE sampler with a stochastic start for diffusion bridge models. Our method is fully compatible with pretrained diffusion bridge models and requires no additional training.
arXiv Detail & Related papers (2024-12-28T03:32:26Z)
Latent Schrodinger Bridge: Prompting Latent Diffusion for Fast Unpaired Image-to-Image Translation [58.19676004192321]
Diffusion models (DMs), which enable both image generation from noise and inversion from data, have inspired powerful unpaired image-to-image (I2I) translation algorithms. We tackle this problem with Schrodinger Bridges (SBs), which are stochastic differential equations (SDEs) between distributions with minimal transport cost. Inspired by this observation, we propose Latent Schrodinger Bridges (LSBs) that approximate the SB ODE via pre-trained Stable Diffusion. We demonstrate that our algorithm successfully conducts competitive I2I translation in an unsupervised setting with only a fraction of cost required by previous DM-
arXiv Detail & Related papers (2024-11-22T11:24:14Z)
Solving Video Inverse Problems Using Image Diffusion Models [58.464465016269614]
We introduce an innovative video inverse solver that leverages only image diffusion models. Our method treats the time dimension of a video as the batch dimension of image diffusion models. We also introduce a batch-consistent sampling strategy that encourages consistency across batches.
arXiv Detail & Related papers (2024-09-04T09:48:27Z)
Diffusion Bridge Implicit Models [25.213664260896103]
Denoising diffusion bridge models (DDBMs) are a powerful variant of diffusion models for interpolating between two arbitrary paired distributions. We take the first step in fast sampling of DDBMs without extra training, motivated by the well-established recipes in diffusion models. We induce a novel, simple, and insightful form of ordinary differential equation (ODE) which inspires high-order numerical solvers.
arXiv Detail & Related papers (2024-05-24T19:08:30Z)
Denoising Diffusion Bridge Models [54.87947768074036]
Diffusion models are powerful generative models that map noise to data using stochastic processes. For many applications such as image editing, the model input comes from a distribution that is not random noise. In our work, we propose Denoising Diffusion Bridge Models (DDBMs).
arXiv Detail & Related papers (2023-09-29T03:24:24Z)
Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced in image deblurring and exhibited promising performance. We propose the Hierarchical Integration Diffusion Model (HI-Diff), for realistic image deblurring. Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z)
A Variational Perspective on Solving Inverse Problems with Diffusion Models [101.831766524264]
Inverse tasks can be formulated as inferring a posterior distribution over data. This is however challenging in diffusion models since the nonlinear and iterative nature of the diffusion process renders the posterior intractable. We propose a variational approach that by design seeks to approximate the true posterior distribution.
arXiv Detail & Related papers (2023-05-07T23:00:47Z)
DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion [144.9653045465908]
We propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM). Our approach yields promising fusion results in infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2023-03-13T04:06:42Z)
Deep Equilibrium Approaches to Diffusion Models [1.4275201654498746]
Diffusion-based generative models are extremely effective in generating high-quality images. These models typically require long sampling chains to produce high-fidelity images. We look at diffusion models through a different perspective, that of a (deep) equilibrium (DEQ) fixed point model.
arXiv Detail & Related papers (2022-10-23T22:02:19Z)
Image Generation with Multimodal Priors using Denoising Diffusion Probabilistic Models [54.1843419649895]
A major challenge in using generative models to accomplish this task is the lack of paired data containing all modalities and corresponding outputs. We propose a solution based on denoising diffusion probabilistic models to synthesize images under multimodal priors.
arXiv Detail & Related papers (2022-06-10T12:23:05Z)