GD-VDM: Generated Depth for better Diffusion-based Video Generation
- URL: http://arxiv.org/abs/2306.11173v1
- Date: Mon, 19 Jun 2023 21:32:10 GMT
- Title: GD-VDM: Generated Depth for better Diffusion-based Video Generation
- Authors: Ariel Lapid, Idan Achituve, Lior Bracha, Ethan Fetaya
- Abstract summary: This paper proposes GD-VDM, a novel diffusion model for video generation, demonstrating promising results.
We evaluated GD-VDM on the Cityscapes dataset and found that it generates more diverse and complex scenes compared to natural baselines.
- Score: 18.039417502897486
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The field of generative models has recently witnessed significant progress,
with diffusion models showing remarkable performance in image generation. In
light of this success, there is a growing interest in exploring the application
of diffusion models to other modalities. One such challenge is the generation
of coherent videos of complex scenes, which poses several technical
difficulties, such as capturing temporal dependencies and generating long,
high-resolution videos. This paper proposes GD-VDM, a novel diffusion model for
video generation, demonstrating promising results. GD-VDM is based on a
two-phase generation process involving generating depth videos followed by a
novel diffusion Vid2Vid model that generates a coherent real-world video. We
evaluated GD-VDM on the Cityscapes dataset and found that it generates more
diverse and complex scenes compared to natural baselines, demonstrating the
efficacy of our approach.
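Below is a minimal, illustrative sketch of the two-phase process the abstract describes: a first diffusion model samples a depth video, and a second, depth-conditioned (Vid2Vid) diffusion model translates it into an RGB video. The class names, toy denoiser, noise schedule, and tensor shapes are assumptions made for illustration only, not the authors' implementation.

```python
# Sketch of a two-phase depth-then-RGB diffusion pipeline (illustrative, not GD-VDM's code).
import torch
import torch.nn as nn

def make_schedule(steps=50):
    # Simple linear beta schedule and its cumulative alpha product.
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas_bar = torch.cumprod(1.0 - betas, dim=0)
    return betas, alphas_bar

class Denoiser(nn.Module):
    """Tiny stand-in for a video U-Net; predicts the noise in its input."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.net = nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1)
    def forward(self, x, t):
        return self.net(x)  # a real model would also embed the timestep t

@torch.no_grad()
def sample(model, shape, cond=None, steps=50):
    """Generic DDPM-style ancestral sampling; `cond` is concatenated on channels."""
    betas, alphas_bar = make_schedule(steps)
    x = torch.randn(shape)
    for t in reversed(range(steps)):
        inp = x if cond is None else torch.cat([x, cond], dim=1)
        eps = model(inp, t)
        alpha_t, abar_t = 1.0 - betas[t], alphas_bar[t]
        x = (x - betas[t] / (1.0 - abar_t).sqrt() * eps) / alpha_t.sqrt()
        if t > 0:
            x = x + betas[t].sqrt() * torch.randn_like(x)
    return x

# Phase 1: sample a single-channel depth video; Phase 2: translate it to RGB.
B, T, H, W = 1, 8, 64, 64
depth_model = Denoiser(in_ch=1, out_ch=1)        # depth-video diffusion model
vid2vid_model = Denoiser(in_ch=3 + 1, out_ch=3)  # RGB denoiser conditioned on depth
depth_video = sample(depth_model, (B, 1, T, H, W))
rgb_video = sample(vid2vid_model, (B, 3, T, H, W), cond=depth_video)
```

In this sketch the Vid2Vid conditioning is done by channel concatenation of the generated depth video, one simple way to realize the described two-phase design; the paper's actual conditioning mechanism may differ.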
Related papers
- Redefining Temporal Modeling in Video Diffusion: The Vectorized Timestep Approach [29.753974393652356]
We propose a frame-aware video diffusion model (FVDM).
Our approach allows each frame to follow an independent noise schedule, enhancing the model's capacity to capture fine-grained temporal dependencies (a minimal sketch of this idea appears after the related-papers list).
Our empirical evaluations show that FVDM outperforms state-of-the-art methods in video generation quality, while also excelling in extended tasks.
arXiv Detail & Related papers (2024-10-04T05:47:39Z) - JVID: Joint Video-Image Diffusion for Visual-Quality and Temporal-Consistency in Video Generation [6.463753697299011]
We introduce the Joint Video-Image Diffusion model (JVID), a novel approach to generating high-quality, temporally coherent videos.
Our results demonstrate quantitative and qualitative improvements in producing realistic and coherent videos.
arXiv Detail & Related papers (2024-09-21T13:59:50Z) - Vivid-ZOO: Multi-View Video Generation with Diffusion Model [76.96449336578286]
New challenges lie in the lack of massive captioned multi-view videos and the complexity of modeling such a multi-dimensional distribution.
We propose a novel diffusion-based pipeline that generates high-quality multi-view videos centered around a dynamic 3D object from text.
arXiv Detail & Related papers (2024-06-12T21:44:04Z) - A Survey on Video Diffusion Models [103.03565844371711]
The recent wave of AI-generated content (AIGC) has witnessed substantial success in computer vision.
Due to their impressive generative capabilities, diffusion models are gradually superseding methods based on GANs and auto-regressive Transformers.
This paper presents a comprehensive review of video diffusion models in the AIGC era.
arXiv Detail & Related papers (2023-10-16T17:59:28Z) - LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models [133.088893990272]
We learn a high-quality text-to-video (T2V) generative model by leveraging a pre-trained text-to-image (T2I) model as a basis.
We propose LaVie, an integrated video generation framework that operates on cascaded video latent diffusion models.
arXiv Detail & Related papers (2023-09-26T17:52:03Z) - Video Probabilistic Diffusion Models in Projected Latent Space [75.4253202574722]
We propose a novel generative model for videos, coined projected latent video diffusion models (PVDM).
PVDM learns a video distribution in a low-dimensional latent space and thus can be efficiently trained with high-resolution videos under limited resources.
arXiv Detail & Related papers (2023-02-15T14:22:34Z) - VIDM: Video Implicit Diffusion Models [75.90225524502759]
Diffusion models have emerged as a powerful generative method for synthesizing high-quality and diverse images.
We propose a video generation method based on diffusion models, where the effects of motion are modeled as an implicit condition.
We improve the quality of the generated videos by proposing multiple strategies such as sampling space truncation, robustness penalty, and positional group normalization.
arXiv Detail & Related papers (2022-12-01T02:58:46Z) - A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models.
They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space.
This survey presents a plethora of advanced techniques aimed at enhancing diffusion models.
arXiv Detail & Related papers (2022-09-06T16:56:21Z)
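As referenced in the FVDM entry above, here is a minimal sketch of the vectorized-timestep idea: during training, each frame of a clip is noised with its own diffusion timestep rather than one timestep shared by the whole video. The function names, schedule, and shapes are illustrative assumptions, not the authors' code.

```python
# Sketch of per-frame (vectorized) timesteps in the diffusion forward process.
import torch

def make_alphas_bar(steps=1000):
    # Cumulative alpha product for a linear beta schedule.
    betas = torch.linspace(1e-4, 0.02, steps)
    return torch.cumprod(1.0 - betas, dim=0)

def noise_video_per_frame(x0, alphas_bar):
    """x0: clean video (B, T, C, H, W). Each frame is noised with its own timestep."""
    B, T = x0.shape[:2]
    t = torch.randint(0, alphas_bar.numel(), (B, T))    # one timestep per frame
    abar = alphas_bar[t].view(B, T, 1, 1, 1)             # per-frame noise level
    eps = torch.randn_like(x0)
    xt = abar.sqrt() * x0 + (1.0 - abar).sqrt() * eps    # frame-wise forward process
    return xt, t, eps

# Usage: a denoiser would receive xt together with the per-frame timestep vector t
# and be trained to predict eps, as in a standard DDPM objective.
video = torch.randn(2, 8, 3, 64, 64)
xt, t, eps = noise_video_per_frame(video, make_alphas_bar())
```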
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.