Diffusion Probabilistic Modeling for Video Generation
- URL: http://arxiv.org/abs/2203.09481v1
- Date: Wed, 16 Mar 2022 03:52:45 GMT
- Title: Diffusion Probabilistic Modeling for Video Generation
- Authors: Ruihan Yang, Prakhar Srivastava, Stephan Mandt
- Abstract summary: Denoising diffusion probabilistic models are a promising new class of generative models that are competitive with GANs on perceptual metrics.
Inspired by recent advances in neural video compression, we use denoising diffusion models to generate a residual baseline to a deterministic next-frame prediction.
We find significant improvements in terms of perceptual quality on all data and improvements in terms of frame forecasting for complex high-resolution videos.
- Score: 17.48026395867434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Denoising diffusion probabilistic models are a promising new class of
generative models that are competitive with GANs on perceptual metrics. In this
paper, we explore their potential for sequentially generating video. Inspired
by recent advances in neural video compression, we use denoising diffusion
models to stochastically generate a residual to a deterministic next-frame
prediction. We compare this approach to two sequential VAE and two GAN
baselines on four datasets, where we test the generated frames for perceptual
quality and forecasting accuracy against ground truth frames. We find
significant improvements in terms of perceptual quality on all data and
improvements in terms of frame forecasting for complex high-resolution videos.
Related papers
- Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation [59.184980778643464]
Fine-tuning Diffusion Models remains an underexplored frontier in generative artificial intelligence (GenAI)
In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion)
Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment.
arXiv Detail & Related papers (2024-02-15T18:59:18Z) - Adversarial Training of Denoising Diffusion Model Using Dual
Discriminators for High-Fidelity Multi-Speaker TTS [0.0]
The diffusion model is capable of generating high-quality data through a probabilistic approach.
It suffers from the drawback of slow generation speed due to the requirement of a large number of time steps.
We propose a speech synthesis model with two discriminators: a diffusion discriminator for learning the distribution of the reverse process and a spectrogram discriminator for learning the distribution of the generated data.
arXiv Detail & Related papers (2023-08-03T07:22:04Z) - Conditional Denoising Diffusion for Sequential Recommendation [62.127862728308045]
Two prominent generative models, Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs)
GANs suffer from unstable optimization, while VAEs are prone to posterior collapse and over-smoothed generations.
We present a conditional denoising diffusion model, which includes a sequence encoder, a cross-attentive denoising decoder, and a step-wise diffuser.
arXiv Detail & Related papers (2023-04-22T15:32:59Z) - VideoFusion: Decomposed Diffusion Models for High-Quality Video
Generation [88.49030739715701]
This work presents a decomposed diffusion process via resolving the per-frame noise into a base noise that is shared among all frames and a residual noise that varies along the time axis.
Experiments on various datasets confirm that our approach, termed as VideoFusion, surpasses both GAN-based and diffusion-based alternatives in high-quality video generation.
arXiv Detail & Related papers (2023-03-15T02:16:39Z) - Insights from Generative Modeling for Neural Video Compression [31.59496634465347]
We present newly proposed neural video coding algorithms through the lens of deep autoregressive and latent variable modeling.
We propose several architectures that yield state-of-the-art video compression performance on high-resolution video.
We provide further evidence that the generative modeling viewpoint can advance the neural video coding field.
arXiv Detail & Related papers (2021-07-28T02:19:39Z) - Variational Diffusion Models [33.0719137062396]
We introduce a family of diffusion-based generative models that obtain state-of-the-art likelihoods on image density estimation benchmarks.
We show that the variational lower bound (VLB) simplifies to a remarkably short expression in terms of the signal-to-noise ratio of the diffused data.
arXiv Detail & Related papers (2021-07-01T17:43:20Z) - Anomaly Detection of Time Series with Smoothness-Inducing Sequential
Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series.
Our model parameterizes mean and variance for each time-stamp with flexible neural networks.
We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z) - Score-Based Generative Modeling through Stochastic Differential
Equations [114.39209003111723]
We present a differential equation that transforms a complex data distribution to a known prior distribution by injecting noise.
A corresponding reverse-time SDE transforms the prior distribution back into the data distribution by slowly removing the noise.
By leveraging advances in score-based generative modeling, we can accurately estimate these scores with neural networks.
We demonstrate high fidelity generation of 1024 x 1024 images for the first time from a score-based generative model.
arXiv Detail & Related papers (2020-11-26T19:39:10Z) - Denoising Diffusion Probabilistic Models [91.94962645056896]
We present high quality image synthesis results using diffusion probabilistic models.
Our best results are obtained by training on a weighted variational bound designed according to a novel connection between diffusion probabilistic models and denoising score matching with Langevin dynamics.
arXiv Detail & Related papers (2020-06-19T17:24:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.