STaR: Self-supervised Tracking and Reconstruction of Rigid Objects in
Motion with Neural Rendering
- URL: http://arxiv.org/abs/2101.01602v1
- Date: Tue, 22 Dec 2020 23:45:28 GMT
- Title: STaR: Self-supervised Tracking and Reconstruction of Rigid Objects in
Motion with Neural Rendering
- Authors: Wentao Yuan, Zhaoyang Lv, Tanner Schmidt, Steven Lovegrove
- Abstract summary: We present STaR, a novel method that performs Self-supervised Tracking and Reconstruction of dynamic scenes with rigid motion from multi-view RGB videos without any manual annotation.
We show that our method can render photorealistic novel views, where novelty is measured on both spatial and temporal axes.
- Score: 9.600908665766465
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present STaR, a novel method that performs Self-supervised Tracking and
Reconstruction of dynamic scenes with rigid motion from multi-view RGB videos
without any manual annotation. Recent work has shown that neural networks are
surprisingly effective at the task of compressing many views of a scene into a
learned function which maps from a viewing ray to an observed radiance value
via volume rendering. Unfortunately, these methods lose all their predictive
power once any object in the scene has moved. In this work, we explicitly model
rigid motion of objects in the context of neural representations of radiance
fields. We show that without any additional human specified supervision, we can
reconstruct a dynamic scene with a single rigid object in motion by
simultaneously decomposing it into its two constituent parts and encoding each
with its own neural representation. We achieve this by jointly optimizing the
parameters of two neural radiance fields and a set of rigid poses which align
the two fields at each frame. On both synthetic and real world datasets, we
demonstrate that our method can render photorealistic novel views, where
novelty is measured on both spatial and temporal axes. Our factored
representation furthermore enables animation of unseen object motion.
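
To make the factored representation concrete, the sketch below (a minimal illustration, not the authors' released code) composites a static radiance field and a rigidly moving object's field along a single ray, with a per-frame rigid pose mapping world coordinates into the object frame before the dynamic field is queried. The tiny MLPs, the axis-angle pose parameterization, and all names are assumptions made for illustration; in the paper both fields and the per-frame poses are optimized jointly against a multi-view photometric loss.

```python
# Minimal sketch (illustrative assumptions, not the authors' code) of compositing a
# static radiance field with a rigidly moving object's field along one camera ray.
import torch
import torch.nn as nn

class TinyRadianceField(nn.Module):
    """Toy stand-in for a NeRF MLP: maps a 3D point to (density, rgb)."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(3, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 4))

    def forward(self, x):                       # x: (N, 3)
        out = self.net(x)
        sigma = torch.relu(out[..., :1])        # non-negative density
        rgb = torch.sigmoid(out[..., 1:])       # colors in [0, 1]
        return sigma, rgb

def rodrigues(axis_angle):
    """Axis-angle vector (3,) -> rotation matrix (3, 3)."""
    theta = axis_angle.norm() + 1e-8
    k = axis_angle / theta
    K = torch.zeros(3, 3)
    K[0, 1], K[0, 2] = -k[2], k[1]
    K[1, 0], K[1, 2] = k[2], -k[0]
    K[2, 0], K[2, 1] = -k[1], k[0]
    return torch.eye(3) + torch.sin(theta) * K + (1 - torch.cos(theta)) * (K @ K)

def render_ray(static_f, dynamic_f, pose, origin, direction, n_samples=64, far=4.0):
    """Composite the two fields along one ray; the per-frame pose aligns the dynamic field."""
    t = torch.linspace(0.05, far, n_samples)
    pts = origin + t[:, None] * direction               # sample points in world frame
    R, trans = rodrigues(pose[:3]), pose[3:]
    pts_obj = (pts - trans) @ R                          # world -> object coordinates
    sigma_s, rgb_s = static_f(pts)
    sigma_d, rgb_d = dynamic_f(pts_obj)
    sigma = sigma_s + sigma_d                            # densities add
    rgb = (sigma_s * rgb_s + sigma_d * rgb_d) / (sigma + 1e-8)  # density-weighted color
    delta = t[1] - t[0]
    alpha = 1 - torch.exp(-sigma[:, 0] * delta)
    transmittance = torch.cumprod(torch.cat([torch.ones(1), 1 - alpha + 1e-10]), dim=0)[:-1]
    weights = alpha * transmittance
    return (weights[:, None] * rgb).sum(dim=0)           # composited ray color

# Both MLPs and one rigid pose per frame would be optimized jointly from multi-view RGB.
static_field, dynamic_field = TinyRadianceField(), TinyRadianceField()
pose = torch.zeros(6)  # per-frame rigid pose: axis-angle rotation + translation
color = render_ray(static_field, dynamic_field, pose,
                   origin=torch.tensor([0.0, 0.0, -2.0]),
                   direction=torch.tensor([0.0, 0.0, 1.0]))
```

The density-weighted color mixing used here is the standard way compositional NeRF variants combine overlapping fields; the key STaR-specific ingredient it illustrates is that the dynamic field is always queried in object coordinates obtained from a learned, per-frame rigid transform.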
Related papers
- Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering [57.895846642868904]
We present a 3D generative model named DynaVol-S for dynamic scenes that enables object-centric learning.
Object-centric voxelization infers per-object occupancy probabilities at individual spatial locations.
Our approach integrates 2D semantic features to create 3D semantic grids, representing the scene through multiple disentangled voxel grids.
arXiv Detail & Related papers (2024-07-30T15:33:58Z)
- DynIBaR: Neural Dynamic Image-Based Rendering [79.44655794967741]
We address the problem of synthesizing novel views from a monocular video depicting a complex dynamic scene.
We adopt a volumetric image-based rendering framework that synthesizes new viewpoints by aggregating features from nearby views.
We demonstrate significant improvements over state-of-the-art methods on dynamic scene datasets.
arXiv Detail & Related papers (2022-11-20T20:57:02Z)
- Neural Groundplans: Persistent Neural Scene Representations from a Single Image [90.04272671464238]
We present a method to map 2D image observations of a scene to a persistent 3D scene representation.
We propose conditional neural groundplans as persistent and memory-efficient scene representations.
arXiv Detail & Related papers (2022-07-22T17:41:24Z)
- Editable Free-viewpoint Video Using a Layered Neural Representation [35.44420164057911]
We propose the first approach for editable free-viewpoint video generation for large-scale dynamic scenes using only 16 sparse cameras.
The core of our approach is a new layered neural representation, where each dynamic entity including the environment itself is formulated into a space-time coherent neural layered radiance representation called ST-NeRF.
Experiments demonstrate the effectiveness of our approach to achieve high-quality, photo-realistic, and editable free-viewpoint video generation for dynamic scenes.
arXiv Detail & Related papers (2021-04-30T06:50:45Z)
- Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Dynamic Scene From Monocular Video [76.19076002661157]
Non-Rigid Neural Radiance Fields (NR-NeRF) is a reconstruction and novel view synthesis approach for general non-rigid dynamic scenes.
We show that even a single consumer-grade camera is sufficient to synthesize sophisticated renderings of a dynamic scene from novel virtual camera views.
arXiv Detail & Related papers (2020-12-22T18:46:12Z)
- D-NeRF: Neural Radiance Fields for Dynamic Scenes [72.75686949608624]
We introduce D-NeRF, a method that extends neural radiance fields to a dynamic domain.
D-NeRF reconstructs images of objects under rigid and non-rigid motions from a camera moving around the scene.
We demonstrate the effectiveness of our approach on scenes with objects under rigid, articulated and non-rigid motions.
arXiv Detail & Related papers (2020-11-27T19:06:50Z)
- Neural Scene Graphs for Dynamic Scenes [57.65413768984925]
We present the first neural rendering method that decomposes dynamic scenes into scene graphs.
We learn implicitly encoded scenes combined with a jointly learned latent representation that describes objects with a single implicit function.
arXiv Detail & Related papers (2020-11-20T12:37:10Z)