4Diffusion: Multi-view Video Diffusion Model for 4D Generation
- URL: http://arxiv.org/abs/2405.20674v2
- Date: Tue, 22 Oct 2024 08:50:16 GMT
- Title: 4Diffusion: Multi-view Video Diffusion Model for 4D Generation
- Authors: Haiyu Zhang, Xinyuan Chen, Yaohui Wang, Xihui Liu, Yunhong Wang, Yu Qiao
- Abstract summary: Current 4D generation methods have achieved noteworthy efficacy with the aid of advanced diffusion generative models.
We propose a novel 4D generation pipeline, namely 4Diffusion, aimed at generating spatial-temporally consistent 4D content from a monocular video.
- Score: 55.82208863521353
- Abstract: Current 4D generation methods have achieved noteworthy efficacy with the aid of advanced diffusion generative models. However, these methods lack multi-view spatial-temporal modeling and encounter challenges in integrating diverse prior knowledge from multiple diffusion models, resulting in inconsistent temporal appearance and flickers. In this paper, we propose a novel 4D generation pipeline, namely 4Diffusion, aimed at generating spatial-temporally consistent 4D content from a monocular video. We first design a unified diffusion model tailored for multi-view video generation by incorporating a learnable motion module into a frozen 3D-aware diffusion model to capture multi-view spatial-temporal correlations. After training on a curated dataset, our diffusion model acquires reasonable temporal consistency and inherently preserves the generalizability and spatial consistency of the 3D-aware diffusion model. Subsequently, we propose 4D-aware Score Distillation Sampling loss, which is based on our multi-view video diffusion model, to optimize 4D representation parameterized by dynamic NeRF. This aims to eliminate discrepancies arising from multiple diffusion models, allowing for generating spatial-temporally consistent 4D content. Moreover, we devise an anchor loss to enhance the appearance details and facilitate the learning of dynamic NeRF. Extensive qualitative and quantitative experiments demonstrate that our method achieves superior performance compared to previous methods.
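The Score Distillation Sampling (SDS) idea named in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the paper's 4D-aware loss: it uses the standard SDS gradient form `weight * (eps_hat - eps)`, treats the rendered sample as a flat list of numbers, and replaces the multi-view video diffusion model with a hypothetical closed-form denoiser (`toy_denoiser`) that assumes the only clean sample is the all-zero vector. All function names and parameter values here are assumptions made for illustration.

```python
import math
import random


def sds_gradient(x, denoiser, alpha_bar, weight, rng):
    """One Monte Carlo sample of the SDS gradient with respect to the sample x.

    Standard SDS form: weight * (eps_hat - eps); the denoiser's own
    Jacobian is skipped, as is usual for SDS.
    """
    eps = [rng.gauss(0.0, 1.0) for _ in x]
    a = math.sqrt(alpha_bar)
    s = math.sqrt(1.0 - alpha_bar)
    # Forward-diffuse the sample to the chosen noise level.
    x_t = [a * xi + s * ei for xi, ei in zip(x, eps)]
    eps_hat = denoiser(x_t, alpha_bar)
    return [weight * (eh - ei) for eh, ei in zip(eps_hat, eps)]


def toy_denoiser(x_t, alpha_bar):
    """Hypothetical stand-in for a diffusion model (an assumption).

    It assumes the clean sample is all zeros, so x_t is pure scaled noise.
    """
    s = math.sqrt(1.0 - alpha_bar)
    return [v / s for v in x_t]


def optimize(x, steps=200, lr=0.1, alpha_bar=0.5, weight=1.0, seed=0):
    """Plain gradient descent on x driven by the SDS gradient."""
    rng = random.Random(seed)
    for _ in range(steps):
        grad = sds_gradient(x, toy_denoiser, alpha_bar, weight, rng)
        x = [xi - lr * gi for xi, gi in zip(x, grad)]
    return x


if __name__ == "__main__":
    print(optimize([1.0, -2.0, 0.5]))  # values shrink toward the toy prior's mode at zero
```

In a real pipeline the sample would be an image or video rendered from a parameterized representation (a dynamic NeRF in the abstract's setting), and this gradient would be backpropagated through the renderer to the representation's parameters rather than applied to the sample directly.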
Related papers
- Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video Diffusion Models [116.31344506738816]
We present a novel framework, Diffusion4D, for efficient and scalable 4D content generation.
We develop a 4D-aware video diffusion model capable of synthesizing orbital views of dynamic 3D assets.
Our method surpasses prior state-of-the-art techniques in terms of generation efficiency and 4D geometry consistency.
arXiv Detail & Related papers (2024-05-26T17:47:34Z)
- SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer [57.506654943449796]
We propose an efficient, sparse-controlled video-to-4D framework named SC4D that decouples motion and appearance.
Our method surpasses existing methods in both quality and efficiency.
We devise a novel application that seamlessly transfers motion onto a diverse array of 4D entities.
arXiv Detail & Related papers (2024-04-04T18:05:18Z)
- Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Video and Multi-view Diffusion Models [6.738732514502613]
Diffusion$^2$ is a novel framework for dynamic 3D content creation.
It reconciles the knowledge about geometric consistency and temporal smoothness from 3D models to directly sample dense multi-view images.
Experiments demonstrate the efficacy of our proposed framework in generating highly seamless and consistent 4D assets.
arXiv Detail & Related papers (2024-04-02T17:58:03Z)
- AnimateMe: 4D Facial Expressions via Diffusion Models [72.63383191654357]
Recent advances in diffusion models have enhanced the capabilities of generative models in 2D animation.
We employ Graph Neural Networks (GNNs) as denoising diffusion models in a novel approach, formulating the diffusion process directly on the mesh space.
This facilitates the generation of facial deformations through a mesh-diffusion-based model.
arXiv Detail & Related papers (2024-03-25T21:40:44Z)
- STAG4D: Spatial-Temporal Anchored Generative 4D Gaussians [36.83603109001298]
STAG4D is a novel framework that combines pre-trained diffusion models with dynamic 3D Gaussian splatting for high-fidelity 4D generation.
We show that our method outperforms prior 4D generation works in rendering quality, spatial-temporal consistency, and generation robustness.
arXiv Detail & Related papers (2024-03-22T04:16:33Z)
- Motion2VecSets: 4D Latent Vector Set Diffusion for Non-rigid Shape Reconstruction and Tracking [52.393359791978035]
Motion2VecSets is a 4D diffusion model for dynamic surface reconstruction from point cloud sequences.
We parameterize 4D dynamics with latent sets instead of using global latent codes.
For more temporally-coherent object tracking, we synchronously denoise deformation latent sets and exchange information across multiple frames.
arXiv Detail & Related papers (2024-01-12T15:05:08Z)
- CAD: Photorealistic 3D Generation via Adversarial Distillation [28.07049413820128]
We propose a novel learning paradigm for 3D synthesis that utilizes pre-trained diffusion models.
Our method unlocks the generation of high-fidelity and photorealistic 3D content conditioned on a single image and prompt.
arXiv Detail & Related papers (2023-12-11T18:59:58Z)
- Consistent4D: Consistent 360° Dynamic Object Generation from Monocular Video [15.621374353364468]
Consistent4D is a novel approach for generating 4D dynamic objects from uncalibrated monocular videos.
We cast the 360-degree dynamic object reconstruction as a 4D generation problem, eliminating the need for tedious multi-view data collection and camera calibration.
arXiv Detail & Related papers (2023-11-06T03:26:43Z)
- A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models.
They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space.
This survey presents a plethora of advanced techniques aimed at enhancing diffusion models.
arXiv Detail & Related papers (2022-09-06T16:56:21Z)