NVFi: Neural Velocity Fields for 3D Physics Learning from Dynamic Videos
- URL: http://arxiv.org/abs/2312.06398v1
- Date: Mon, 11 Dec 2023 14:07:31 GMT
- Title: NVFi: Neural Velocity Fields for 3D Physics Learning from Dynamic Videos
- Authors: Jinxi Li, Ziyang Song, Bo Yang
- Abstract summary: We propose to simultaneously learn the geometry, appearance, and physical velocity of 3D scenes only from video frames.
We conduct extensive experiments on multiple datasets, demonstrating the superior performance of our method over all baselines.
- Score: 8.559809421797784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we aim to model 3D scene dynamics from multi-view videos.
Unlike the majority of existing works which usually focus on the common task of
novel view synthesis within the training time period, we propose to
simultaneously learn the geometry, appearance, and physical velocity of 3D
scenes only from video frames, such that multiple desirable applications can be
supported, including future frame extrapolation, unsupervised 3D semantic scene
decomposition, and dynamic motion transfer. Our method consists of three major
components, 1) the keyframe dynamic radiance field, 2) the interframe velocity
field, and 3) a joint keyframe and interframe optimization module which is the
core of our framework to effectively train both networks. To validate our
method, we further introduce two dynamic 3D datasets: 1) Dynamic Object
dataset, and 2) Dynamic Indoor Scene dataset. We conduct extensive experiments
on multiple datasets, demonstrating the superior performance of our method over
all baselines, particularly in the critical tasks of future frame extrapolation
and unsupervised 3D semantic scene decomposition.
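As a rough illustration of why a learned velocity field supports future frame extrapolation, the sketch below moves a 3D point past the last observed time by integrating a velocity function. It is only a minimal sketch under stated assumptions: `velocity` is a hypothetical analytic stand-in (rigid rotation about the z-axis) for a trained network, and `advect` with its midpoint integrator is an illustrative choice, not the authors' implementation.

```python
import math


def velocity(point, t):
    """Hypothetical stand-in for a learned velocity field v(x, t).

    Assumption: rigid rotation about the z-axis at 1 rad/s, independent of t.
    """
    x, y, z = point
    omega = 1.0  # assumed angular speed in rad/s
    return (-omega * y, omega * x, 0.0)


def advect(point, t_start, t_end, steps=100):
    """Move a point from t_start to t_end with midpoint (RK2) integration."""
    dt = (t_end - t_start) / steps
    p, t = point, t_start
    for _ in range(steps):
        v1 = velocity(p, t)
        # Evaluate the field again at the half-step position and time.
        mid = tuple(pi + 0.5 * dt * vi for pi, vi in zip(p, v1))
        v2 = velocity(mid, t + 0.5 * dt)
        p = tuple(pi + dt * vi for pi, vi in zip(p, v2))
        t += dt
    return p


# Extrapolate a point seen at the last training time (t = 1.0) to a future time.
future = advect((1.0, 0.0, 0.5), t_start=1.0, t_end=1.0 + math.pi / 2)
print(future)
```

With this toy field the point is carried a quarter turn around the z-axis, so it ends close to (0, 1, 0.5). In a full pipeline, the advected positions would be used to query the scene representation at a time outside the training window.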
Related papers
- Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding [54.859943475818234]
We present Motion4D, a novel framework that integrates 2D priors from foundation models into a unified 4D Gaussian Splatting representation. Our method features a two-part iterative optimization framework: 1) Sequential optimization, which updates motion and semantic fields in consecutive stages to maintain local consistency, and 2) Global optimization, which jointly refines all attributes for long-term coherence. Our method significantly outperforms both 2D foundation models and existing 3D-based approaches across diverse scene understanding tasks, including point-based tracking, video object segmentation, and novel view synthesis.
arXiv Detail & Related papers (2025-12-03T09:32:56Z)
- 4D3R: Motion-Aware Neural Reconstruction and Rendering of Dynamic Scenes from Monocular Videos [52.89084603734664]
We present 4D3R, a pose-free dynamic neural rendering framework that decouples static and dynamic components through a two-stage approach. Our approach achieves up to 1.8 dB PSNR improvement over state-of-the-art methods.
arXiv Detail & Related papers (2025-11-07T13:25:50Z)
- TRACE: Learning 3D Gaussian Physical Dynamics from Multi-view Videos [7.616167860385134]
We propose a new framework named TRACE to model the motion physics of complex dynamic 3D scenes. By formulating each 3D point as a rigid particle with size and orientation in space, we directly learn a translation-rotation dynamics system for each particle.
arXiv Detail & Related papers (2025-08-13T13:43:01Z)
- MoVieS: Motion-Aware 4D Dynamic View Synthesis in One Second [29.926373004694728]
MoVieS represents dynamic 3D scenes using pixel-aligned grids of Gaussian primitives. MoVieS enables view synthesis, reconstruction and 3D point tracking within a single learning-based framework.
arXiv Detail & Related papers (2025-07-14T08:49:57Z)
- Layered Motion Fusion: Lifting Motion Segmentation to 3D in Egocentric Videos [71.24593306228145]
We propose to improve dynamic segmentation in 3D by fusing motion segmentation predictions from a 2D-based model into layered radiance fields. We address this issue through test-time refinement, which helps the model to focus on specific frames, thereby reducing the data complexity. This demonstrates that 3D techniques can enhance 2D analysis even for dynamic phenomena in a challenging and realistic setting.
arXiv Detail & Related papers (2025-06-05T19:46:48Z)
- MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion [118.74385965694694]
We present Motion DUSt3R (MonST3R), a novel geometry-first approach that directly estimates per-timestep geometry from dynamic scenes.
By simply estimating a pointmap for each timestep, we can effectively adapt DUSt3R's representation, previously only used for static scenes, to dynamic scenes.
We show that by posing the problem as a fine-tuning task, identifying several suitable datasets, and strategically training the model on this limited data, we can surprisingly enable the model to handle dynamics.
arXiv Detail & Related papers (2024-10-04T18:00:07Z)
- Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering [57.895846642868904]
We present a 3D generative model named DynaVol-S for dynamic scenes that enables object-centric learning.
Object-centric voxelization infers per-object occupancy probabilities at individual spatial locations.
Our approach integrates 2D semantic features to create 3D semantic grids, representing the scene through multiple disentangled voxel grids.
arXiv Detail & Related papers (2024-07-30T15:33:58Z)
- Shape of Motion: 4D Reconstruction from a Single Video [51.04575075620677]
We introduce a method capable of reconstructing generic dynamic scenes, featuring explicit, full-sequence-long 3D motion.
We exploit the low-dimensional structure of 3D motion by representing scene motion with a compact set of SE(3) motion bases.
Our method achieves state-of-the-art performance for both long-range 3D/2D motion estimation and novel view synthesis on dynamic scenes.
arXiv Detail & Related papers (2024-07-18T17:59:08Z)
- AutoDecoding Latent 3D Diffusion Models [95.7279510847827]
We present a novel approach to the generation of static and articulated 3D assets that has a 3D autodecoder at its core.
The 3D autodecoder framework embeds properties learned from the target dataset in the latent space.
We then identify the appropriate intermediate volumetric latent space, and introduce robust normalization and de-normalization operations.
arXiv Detail & Related papers (2023-07-07T17:59:14Z)
- OD-NeRF: Efficient Training of On-the-Fly Dynamic Neural Radiance Fields [63.04781030984006]
Dynamic neural radiance fields (dynamic NeRFs) have demonstrated impressive results in novel view synthesis on 3D dynamic scenes.
We propose OD-NeRF, which efficiently trains and renders dynamic NeRFs on the fly and is therefore capable of streaming the dynamic scene.
Our algorithm achieves an interactive speed of 6 FPS for on-the-fly training and rendering on synthetic dynamic scenes, and a significant speed-up over the state of the art on real-world dynamic scenes.
arXiv Detail & Related papers (2023-05-24T07:36:47Z)
- SUDS: Scalable Urban Dynamic Scenes [46.965165390077146]
We extend neural radiance fields (NeRFs) to dynamic large-scale urban scenes.
We factorize the scene into three separate hash table data structures to efficiently encode static, dynamic, and far-field radiance fields.
Our reconstructions can be scaled to tens of thousands of objects across 1.2 million frames from 1700 videos spanning geospatial footprints of hundreds of kilometers.
arXiv Detail & Related papers (2023-03-25T18:55:09Z)
- Exploring Optical-Flow-Guided Motion and Detection-Based Appearance for Temporal Sentence Grounding [61.57847727651068]
Temporal sentence grounding aims to localize a target segment in an untrimmed video semantically according to a given sentence query.
Most previous works focus on learning frame-level features of each whole frame in the entire video, and directly match them with the textual information.
We propose a novel Motion- and Appearance-guided 3D Semantic Reasoning Network (MA3SRN), which incorporates optical-flow-guided motion-aware, detection-based appearance-aware, and 3D-aware object-level features.
arXiv Detail & Related papers (2022-03-06T13:57:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.