Sync4D: Video Guided Controllable Dynamics for Physics-Based 4D Generation
- URL: http://arxiv.org/abs/2405.16849v3
- Date: Sun, 7 Jul 2024 15:12:46 GMT
- Title: Sync4D: Video Guided Controllable Dynamics for Physics-Based 4D Generation
- Authors: Zhoujie Fu, Jiacheng Wei, Wenhao Shen, Chaoyue Song, Xiaofeng Yang, Fayao Liu, Xulei Yang, Guosheng Lin
- Abstract summary: We introduce a novel approach for creating controllable dynamics in 3D-generated Gaussians using casually captured reference videos.
Our method transfers the motion of objects from reference videos to a variety of generated 3D Gaussians across different categories.
Our technique offers specific and high-quality motion transfer, maintaining both shape integrity and temporal consistency.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we introduce a novel approach for creating controllable dynamics in 3D-generated Gaussians using casually captured reference videos. Our method transfers the motion of objects from reference videos to a variety of generated 3D Gaussians across different categories, ensuring precise and customizable motion transfer. We achieve this by employing blend skinning-based non-parametric shape reconstruction to extract the shape and motion of reference objects. This process involves segmenting the reference objects into motion-related parts based on skinning weights and establishing shape correspondences with generated target shapes. To address shape and temporal inconsistencies prevalent in existing methods, we integrate physical simulation, driving the target shapes with matched motion. This integration is optimized through a displacement loss to ensure reliable and genuine dynamics. Our approach supports diverse reference inputs, including humans, quadrupeds, and articulated objects, and can generate dynamics of arbitrary length, providing enhanced fidelity and applicability. Unlike methods heavily reliant on diffusion video generation models, our technique offers specific and high-quality motion transfer, maintaining both shape integrity and temporal consistency.
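The blend-skinning machinery the abstract refers to is standard linear blend skinning (LBS). Below is a minimal sketch of LBS deformation plus one plausible way to segment a mesh into motion-related parts from its skinning weights; the function names and the argmax segmentation heuristic are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def lbs_deform(vertices, weights, rotations, translations):
    """Linear blend skinning: move each vertex by a weighted sum of
    per-bone rigid transforms.

    vertices:     (V, 3) rest-pose vertex positions
    weights:      (V, B) skinning weights; each row sums to 1
    rotations:    (B, 3, 3) per-bone rotation matrices
    translations: (B, 3)   per-bone translations
    """
    # Pose every vertex under every bone transform: (B, V, 3)
    posed = np.einsum('bij,vj->bvi', rotations, vertices) + translations[:, None, :]
    # Blend the posed copies by the skinning weights: (V, 3)
    return np.einsum('vb,bvi->vi', weights, posed)

def segment_parts(weights):
    """Hypothetical part segmentation: assign each vertex to its
    dominant bone, yielding motion-related parts between which shape
    correspondences with the generated target can be established."""
    return weights.argmax(axis=1)  # (V,) integer part label per vertex
```

With part labels on both the reconstructed reference shape and the generated target, correspondences can be established part by part rather than over the whole surface, which is consistent with the pipeline the abstract describes.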
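The abstract does not define the displacement loss beyond its name. A minimal plausible form, assuming a simulator whose per-vertex output can be compared against the reference-matched motion, is a mean-squared error on displacements:

```python
import torch

def displacement_loss(sim_positions, target_displacements, rest_positions):
    """Mean-squared gap between simulated per-vertex displacements and
    the displacements matched from the reference video (assumed form).

    sim_positions:        (T, V, 3) vertex trajectories from the simulator
    target_displacements: (T, V, 3) reference-matched displacements
    rest_positions:       (V, 3)   rest-pose vertices
    """
    sim_displacements = sim_positions - rest_positions[None]
    return torch.mean((sim_displacements - target_displacements) ** 2)
```

If the simulator is differentiable, this loss can be minimized over its physical parameters (e.g. stiffness or damping) by gradient descent; how Sync4D actually parameterizes the simulation is not stated in the abstract.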
Related papers
- EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting
EgoGaussian is a method capable of simultaneously reconstructing 3D scenes and dynamically tracking 3D object motion from RGB egocentric input alone.
We show significant improvements in terms of both dynamic object and background reconstruction quality compared to the state-of-the-art.
arXiv Detail & Related papers (2024-06-28T10:39:36Z)
- SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer
We propose an efficient, sparse-controlled video-to-4D framework named SC4D that decouples motion and appearance.
Our method surpasses existing methods in both quality and efficiency.
We devise a novel application that seamlessly transfers motion onto a diverse array of 4D entities.
arXiv Detail & Related papers (2024-04-04T18:05:18Z)
- Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
We introduce a methodology for human image animation by leveraging a 3D human parametric model within a latent diffusion framework.
By representing the 3D human parametric model as the motion guidance, we can perform parametric shape alignment of the human body between the reference image and the source video motion.
Our approach also exhibits superior generalization capabilities on the proposed in-the-wild dataset.
arXiv Detail & Related papers (2024-03-21T18:52:58Z)
- Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis
We present a method that simultaneously addresses the tasks of dynamic scene novel-view synthesis and six-degree-of-freedom (6-DOF) tracking of all dense scene elements.
We follow an analysis-by-synthesis framework, inspired by recent work that models scenes as a collection of 3D Gaussians.
We demonstrate a large number of downstream applications enabled by our representation, including first-person view synthesis, dynamic compositional scene synthesis, and 4D video editing.
arXiv Detail & Related papers (2023-08-18T17:59:21Z)
- NeuPhysics: Editable Neural Geometry and Physics from Monocular Videos
We present a method for learning 3D geometry and physics parameters of a dynamic scene from only a monocular RGB video input.
Experiments show that our method achieves superior mesh and video reconstruction of dynamic scenes compared to competing Neural Field approaches.
arXiv Detail & Related papers (2022-10-22T04:57:55Z)
- Differentiable Dynamics for Articulated 3d Human Motion Reconstruction
We introduce DiffPhy, a differentiable physics-based model for articulated 3d human motion reconstruction from video.
We validate the model by demonstrating that it can accurately reconstruct physically plausible 3d human motion from monocular video.
arXiv Detail & Related papers (2022-05-24T17:58:37Z)
- Hierarchical Style-based Networks for Motion Synthesis
We propose a self-supervised method for generating long-range, diverse, and plausible behaviors that reach a specific goal location.
Our proposed method learns to model human motion by decomposing the long-range generation task in a hierarchical manner.
On a large-scale skeleton dataset, we show that the proposed method is able to synthesise long-range, diverse, and plausible motion.
arXiv Detail & Related papers (2020-08-24T02:11:02Z)
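Several of the papers above, including Sync4D itself and Dynamic 3D Gaussians, represent a scene or object as a collection of 3D Gaussians whose poses evolve over time. A minimal sketch of that representation (field names are illustrative, not any paper's code) shows why dense 6-DOF tracking falls out of it almost for free:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class GaussianFrame:
    """One time step of a dynamic scene as N persistent 3D Gaussians.
    Appearance attributes stay fixed across time; only the per-Gaussian
    pose (mean and rotation) is re-estimated each frame."""
    means: np.ndarray      # (N, 3) Gaussian centers, updated per frame
    rotations: np.ndarray  # (N, 4) unit quaternions, updated per frame
    scales: np.ndarray     # (N, 3) per-axis extents, fixed over time
    colors: np.ndarray     # (N, 3) RGB, fixed over time
    opacities: np.ndarray  # (N,)  alpha, fixed over time

def track(frames, gaussian_id):
    """6-DOF trajectory of one Gaussian: because identities persist
    across frames, tracking is just reading off its pose per frame."""
    return [(f.means[gaussian_id], f.rotations[gaussian_id])
            for f in frames]
```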