Unbiased 4D: Monocular 4D Reconstruction with a Neural Deformation Model
- URL: http://arxiv.org/abs/2206.08368v3
- Date: Thu, 4 May 2023 10:21:05 GMT
- Title: Unbiased 4D: Monocular 4D Reconstruction with a Neural Deformation Model
- Authors: Erik C.M. Johnson and Marc Habermann and Soshi Shimada and Vladislav Golyanik and Christian Theobalt
- Abstract summary: Capturing general deforming scenes from monocular RGB video is crucial for many computer graphics and vision applications.
Our method, Ub4D, handles large deformations, performs shape completion in occluded regions, and can operate on monocular RGB videos directly by using differentiable volume rendering.
Results on our new dataset, which will be made publicly available, demonstrate a clear improvement over the state of the art in terms of surface reconstruction accuracy and robustness to large deformations.
- Score: 76.64071133839862
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Capturing general deforming scenes from monocular RGB video is crucial for
many computer graphics and vision applications. However, current approaches
suffer from drawbacks such as struggling with large scene deformations,
inaccurate shape completion or requiring 2D point tracks. In contrast, our
method, Ub4D, handles large deformations, performs shape completion in occluded
regions, and can operate on monocular RGB videos directly by using
differentiable volume rendering. This technique includes three components that
are new in the context of non-rigid 3D reconstruction, i.e., 1) a coordinate-based
and implicit neural representation for non-rigid scenes, which in conjunction
with differentiable volume rendering enables an unbiased reconstruction of
dynamic scenes, 2) a proof that extends the unbiased formulation of volume
rendering to dynamic scenes, and 3) a novel dynamic scene flow loss, which
enables the reconstruction of larger deformations by leveraging the coarse
estimates of other methods. Results on our new dataset, which will be made
publicly available, demonstrate a clear improvement over the state of the art
in terms of surface reconstruction accuracy and robustness to large
deformations.
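To make the components concrete, below is a minimal PyTorch sketch of the core idea, assuming a NeuS-style unbiased weight formulation (which Ub4D extends to dynamic scenes): per-frame ray samples are bent into a canonical space by a coordinate-based deformation field, a canonical SDF is evaluated there, and the SDF values are converted into unbiased, occlusion-aware rendering weights. The interfaces deform_net and sdf_net are illustrative assumptions, not the authors' code.

```python
# Minimal sketch only: hypothetical interfaces, not the authors' implementation.
import torch

def unbiased_weights(sdf, s=64.0):
    """NeuS-style unbiased weights from SDF samples along each ray.

    sdf: (num_rays, num_samples) signed distances at ordered ray samples
    s:   sharpness of the logistic CDF Phi_s(x) = sigmoid(s * x)
    """
    cdf = torch.sigmoid(s * sdf)
    # Discrete opacity alpha_i = max((Phi_i - Phi_{i+1}) / Phi_i, 0),
    # which peaks where the ray first crosses the SDF zero level set.
    alpha = ((cdf[:, :-1] - cdf[:, 1:]) / (cdf[:, :-1] + 1e-6)).clamp(min=0.0)
    # Transmittance T_i = prod_{j < i} (1 - alpha_j).
    ones = torch.ones_like(alpha[:, :1])
    trans = torch.cumprod(torch.cat([ones, 1.0 - alpha + 1e-7], -1), -1)[:, :-1]
    return trans * alpha  # w_i = T_i * alpha_i

def dynamic_weights(points, t, deform_net, sdf_net):
    """Warp per-frame ray samples to canonical space, then weight them."""
    canonical = points + deform_net(points, t)  # coordinate-based deformation
    sdf = sdf_net(canonical).squeeze(-1)        # (num_rays, num_samples)
    return unbiased_weights(sdf)
```

The property these weights aim for, which the paper's proof carries over to the dynamic setting, is that they peak at the ray's first intersection with the surface rather than being biased toward the camera; the dynamic scene flow loss (component 3) would additionally supervise deform_net with coarse flow estimates from other methods.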
Related papers
- Neural 4D Evolution under Large Topological Changes from 2D Images [5.678824325812255]
In this work, we address the challenges in extending 3D neural evolution to 4D under large topological changes.
We introduce (i) a new architecture to discretize and encode the deformation and learn the SDF, and (ii) a technique to impose temporal consistency.
To facilitate learning directly from 2D images, we propose a learning framework that can disentangle the geometry and appearance from RGB images.
arXiv Detail & Related papers (2024-11-22T15:47:42Z)
- SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes [7.590932716513324]
We present SpectroMotion, a novel approach that combines 3D Gaussian Splatting (3DGS) with physically-based rendering (PBR) and deformation fields to reconstruct dynamic specular scenes.
arXiv Detail & Related papers (2024-10-22T17:59:56Z)
- Denoising Diffusion via Image-Based Rendering [54.20828696348574]
We introduce the first diffusion model able to perform fast, detailed reconstruction and generation of real-world 3D scenes.
First, we introduce a new neural scene representation, IB-planes, that can efficiently and accurately represent large 3D scenes.
Second, we propose a denoising-diffusion framework to learn a prior over this novel 3D scene representation, using only 2D images.
arXiv Detail & Related papers (2024-02-05T19:00:45Z)
- Motion2VecSets: 4D Latent Vector Set Diffusion for Non-rigid Shape Reconstruction and Tracking [52.393359791978035]
Motion2VecSets is a 4D diffusion model for dynamic surface reconstruction from point cloud sequences.
We parameterize 4D dynamics with latent sets instead of using global latent codes.
For more temporally-coherent object tracking, we synchronously denoise deformation latent sets and exchange information across multiple frames.
arXiv Detail & Related papers (2024-01-12T15:05:08Z)
- Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture [47.44029968307207]
We propose a novel framework for simultaneous high-fidelity recovery of object shapes and textures from single-view images.
Our approach utilizes the proposed Single-view neural implicit Shape and Radiance field (SSR) representations to leverage both explicit 3D shape supervision and volume rendering.
A distinctive feature of our framework is its ability to generate fine-grained textured meshes while seamlessly integrating rendering capabilities into the single-view 3D reconstruction model.
arXiv Detail & Related papers (2023-11-01T11:46:15Z)
- Neuralangelo: High-Fidelity Neural Surface Reconstruction [22.971952498343942]
We present Neuralangelo, which combines the representation power of multi-resolution 3D hash grids with neural surface rendering.
Even without auxiliary inputs such as depth, Neuralangelo can effectively recover dense 3D surface structures from multi-view images with fidelity significantly surpassing previous methods.
arXiv Detail & Related papers (2023-06-05T17:59:57Z)
- NeuPhysics: Editable Neural Geometry and Physics from Monocular Videos [82.74918564737591]
We present a method for learning 3D geometry and physics parameters of a dynamic scene from only a monocular RGB video input.
Experiments show that our method achieves superior mesh and video reconstruction of dynamic scenes compared to competing Neural Field approaches.
arXiv Detail & Related papers (2022-10-22T04:57:55Z)
- LoRD: Local 4D Implicit Representation for High-Fidelity Dynamic Human Modeling [69.56581851211841]
We propose a novel Local 4D implicit Representation for Dynamic clothed human, named LoRD.
Our key insight is to encourage the network to learn the latent codes of local part-level representation.
LoRD has a strong capability for representing 4D humans and outperforms state-of-the-art methods in practical applications.
arXiv Detail & Related papers (2022-08-18T03:49:44Z)
- Towards 3D Scene Reconstruction from Locally Scale-Aligned Monocular Video Depth [90.33296913575818]
In video-based scenarios such as video depth estimation and 3D scene reconstruction from a video, the unknown scale and shift in per-frame predictions may cause depth inconsistency across frames.
We propose a locally weighted linear regression method to recover the scale and shift from very sparse anchor points (a sketch of this kind of local fit follows this list).
Our method can boost the performance of existing state-of-the-art approaches by up to 50% on several zero-shot benchmarks.
arXiv Detail & Related papers (2022-02-03T08:52:54Z)
- Learning monocular 3D reconstruction of articulated categories from motion [39.811816510186475]
Video self-supervision enforces consistency between consecutive 3D reconstructions through a motion-based cycle loss.
We introduce an interpretable model of 3D template deformations that controls a 3D surface through the displacement of a small number of local, learnable handles (a sketch of handle-based deformation also follows this list).
We obtain state-of-the-art reconstructions with diverse shapes, viewpoints, and textures for multiple articulated object categories.
arXiv Detail & Related papers (2021-03-30T13:50:27Z)
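As referenced in the "Towards 3D Scene Reconstruction from Locally Scale-Aligned Monocular Video Depth" entry above, here is a minimal NumPy sketch of one plausible reading of the locally weighted linear fit: a per-pixel scale s and shift t are recovered from sparse anchor depths, with nearby anchors weighted more heavily. The Gaussian weighting, bandwidth, and all names are assumptions, not the paper's exact formulation.

```python
# Illustrative sketch of a locally weighted scale/shift fit; all names and
# the Gaussian weighting scheme are assumptions, not the paper's method.
import numpy as np

def local_scale_shift(pred, metric, anchors_uv, query_uv, bandwidth=32.0):
    """Fit metric ~= s * pred + t, weighting anchors by image distance.

    pred, metric: (K,) predicted and reference depths at K sparse anchors
    anchors_uv:   (K, 2) anchor pixel coordinates
    query_uv:     (2,)   pixel whose local (s, t) we want
    """
    # Gaussian locality weights: nearby anchors dominate the fit.
    d2 = np.sum((anchors_uv - query_uv) ** 2, axis=-1)
    sw = np.sqrt(np.exp(-d2 / (2.0 * bandwidth ** 2)))
    # Weighted least squares for [s, t] via sqrt-weighted lstsq.
    A = np.stack([pred, np.ones_like(pred)], axis=-1)  # (K, 2)
    s, t = np.linalg.lstsq(sw[:, None] * A, sw * metric, rcond=None)[0]
    return s, t
```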
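Likewise, for the "Learning monocular 3D reconstruction of articulated categories from motion" entry, a generic sketch of deforming a template surface through a small number of handles: each vertex moves by a soft, distance-based blend of learned handle displacements. The softmax weighting and temperature are illustrative assumptions, not the paper's exact model.

```python
# Illustrative handle-based deformation sketch; the soft distance-based
# weighting is an assumption for demonstration purposes.
import numpy as np

def deform_with_handles(vertices, handles, displacements, temperature=0.1):
    """Move each vertex by a soft blend of a few handle displacements.

    vertices:      (V, 3) template surface vertices
    handles:       (H, 3) handle locations on or near the template
    displacements: (H, 3) learned per-handle displacement vectors
    """
    # Soft assignment of vertices to handles by squared distance (softmax).
    d2 = np.sum((vertices[:, None, :] - handles[None, :, :]) ** 2, axis=-1)
    logits = -d2 / temperature
    logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=-1, keepdims=True)            # (V, H) blend weights
    return vertices + w @ displacements           # (V, 3) deformed surface
```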