TFS-NeRF: Template-Free NeRF for Semantic 3D Reconstruction of Dynamic Scene
- URL: http://arxiv.org/abs/2409.17459v2
- Date: Wed, 6 Nov 2024 09:50:06 GMT
- Title: TFS-NeRF: Template-Free NeRF for Semantic 3D Reconstruction of Dynamic Scene
- Authors: Sandika Biswas, Qianyi Wu, Biplab Banerjee, Hamid Rezatofighi,
- Abstract summary: This paper introduces a 3D semantic NeRF for dynamic scenes captured from sparse or singleview RGB videos.
Our framework uses an Invertible Neural Network (INN) for LBS prediction, the training process.
Our approach produces high-quality reconstructions of both deformable and non-deformable objects in complex interactions.
- Score: 25.164085646259856
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite advancements in Neural Implicit models for 3D surface reconstruction, handling dynamic environments with arbitrary rigid, non-rigid, or deformable entities remains challenging. Many template-based methods are entity-specific, focusing on humans, while generic reconstruction methods adaptable to such dynamic scenes often require additional inputs like depth or optical flow or rely on pre-trained image features for reasonable outcomes. These methods typically use latent codes to capture frame-by-frame deformations. In contrast, some template-free methods bypass these requirements and adopt traditional LBS (Linear Blend Skinning) weights for a detailed representation of deformable object motions, although they involve complex optimizations leading to lengthy training times. To this end, as a remedy, this paper introduces TFS-NeRF, a template-free 3D semantic NeRF for dynamic scenes captured from sparse or single-view RGB videos, featuring interactions among various entities and more time-efficient than other LBS-based approaches. Our framework uses an Invertible Neural Network (INN) for LBS prediction, simplifying the training process. By disentangling the motions of multiple entities and optimizing per-entity skinning weights, our method efficiently generates accurate, semantically separable geometries. Extensive experiments demonstrate that our approach produces high-quality reconstructions of both deformable and non-deformable objects in complex interactions, with improved training efficiency compared to existing methods.
Related papers
- MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion [118.74385965694694]
We present Motion DUSt3R (MonST3R), a novel geometry-first approach that directly estimates per-timestep geometry from dynamic scenes.
By simply estimating a pointmap for each timestep, we can effectively adapt DUST3R's representation, previously only used for static scenes, to dynamic scenes.
We show that by posing the problem as a fine-tuning task, identifying several suitable datasets, and strategically training the model on this limited data, we can surprisingly enable the model to handle dynamics.
arXiv Detail & Related papers (2024-10-04T18:00:07Z) - Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models [4.529832252085145]
We propose a novel SfT reconstruction algorithm for cloth using a pre-trained neural surrogate model.
Differentiable rendering of the simulated mesh enables pixel-wise comparisons between the reconstruction and a target video sequence.
This allows to retain a precise, stable, and smooth reconstructed geometry while reducing the runtime by a factor of 400-500 compared to $phi$-SfT.
arXiv Detail & Related papers (2023-11-21T18:59:58Z) - Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z) - DeFormer: Integrating Transformers with Deformable Models for 3D Shape
Abstraction from a Single Image [31.154786931081087]
We propose a novel bi-channel Transformer architecture, integrated with parameterized deformable models, to simultaneously estimate the global and local deformations of primitives.
DeFormer achieves better reconstruction accuracy over the state-of-the-art, and visualizes with consistent semantic correspondences for improved interpretability.
arXiv Detail & Related papers (2023-09-22T02:46:43Z) - SceNeRFlow: Time-Consistent Reconstruction of General Dynamic Scenes [75.9110646062442]
We propose SceNeRFlow to reconstruct a general, non-rigid scene in a time-consistent manner.
Our method takes multi-view RGB videos and background images from static cameras with known camera parameters as input.
We show experimentally that, unlike prior work that only handles small motion, our method enables the reconstruction of studio-scale motions.
arXiv Detail & Related papers (2023-08-16T09:50:35Z) - FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z) - SNUG: Self-Supervised Neural Dynamic Garments [14.83072352654608]
We present a self-supervised method to learn dynamic 3D deformations of garments worn by parametric human bodies.
This allows us to learn models for interactive garments, including dynamic deformations and fine wrinkles, with two orders of magnitude speed up in training time.
arXiv Detail & Related papers (2022-04-05T13:50:21Z) - {\phi}-SfT: Shape-from-Template with a Physics-Based Deformation Model [69.27632025495512]
Shape-from-Template (SfT) methods estimate 3D surface deformations from a single monocular RGB camera.
This paper proposes a new SfT approach explaining 2D observations through physical simulations.
arXiv Detail & Related papers (2022-03-22T17:59:57Z) - PaMIR: Parametric Model-Conditioned Implicit Representation for
Image-based Human Reconstruction [67.08350202974434]
We propose Parametric Model-Conditioned Implicit Representation (PaMIR), which combines the parametric body model with the free-form deep implicit function.
We show that our method achieves state-of-the-art performance for image-based 3D human reconstruction in the cases of challenging poses and clothing types.
arXiv Detail & Related papers (2020-07-08T02:26:19Z) - SoftSMPL: Data-driven Modeling of Nonlinear Soft-tissue Dynamics for
Parametric Humans [15.83525220631304]
We present SoftSMPL, a learning-based method to model realistic soft-tissue dynamics as a function of body shape and motion.
At the core of our method there are three key contributions that enable us to model highly realistic dynamics.
arXiv Detail & Related papers (2020-04-01T10:35:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.