D-NPC: Dynamic Neural Point Clouds for Non-Rigid View Synthesis from Monocular Video
- URL: http://arxiv.org/abs/2406.10078v1
- Date: Fri, 14 Jun 2024 14:35:44 GMT
- Title: D-NPC: Dynamic Neural Point Clouds for Non-Rigid View Synthesis from Monocular Video
- Authors: Moritz Kappel, Florian Hahlbohm, Timon Scholz, Susana Castillo, Christian Theobalt, Martin Eisemann, Vladislav Golyanik, Marcus Magnor
- Abstract summary: This paper contributes to the field by introducing a new method for dynamic novel view synthesis from monocular video, such as casual smartphone captures.
Our approach represents the scene as a $\textit{dynamic neural point cloud}$, an implicit time-conditioned point distribution that encodes local geometry and appearance in separate hash-encoded neural feature grids.
- Score: 53.83936023443193
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Dynamic reconstruction and spatiotemporal novel-view synthesis of non-rigidly deforming scenes recently gained increased attention. While existing work achieves impressive quality and performance on multi-view or teleporting camera setups, most methods fail to efficiently and faithfully recover motion and appearance from casual monocular captures. This paper contributes to the field by introducing a new method for dynamic novel view synthesis from monocular video, such as casual smartphone captures. Our approach represents the scene as a $\textit{dynamic neural point cloud}$, an implicit time-conditioned point distribution that encodes local geometry and appearance in separate hash-encoded neural feature grids for static and dynamic regions. By sampling a discrete point cloud from our model, we can efficiently render high-quality novel views using a fast differentiable rasterizer and neural rendering network. Similar to recent work, we leverage advances in neural scene analysis by incorporating data-driven priors like monocular depth estimation and object segmentation to resolve motion and depth ambiguities originating from the monocular captures. In addition to guiding the optimization process, we show that these priors can be exploited to explicitly initialize our scene representation to drastically improve optimization speed and final image quality. As evidenced by our experimental evaluation, our dynamic point cloud model not only enables fast optimization and real-time frame rates for interactive applications, but also achieves competitive image quality on monocular benchmark sequences. Our project page is available at https://moritzkappel.github.io/projects/dnpc.
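The abstract above outlines the core pipeline: an implicit, time-conditioned point distribution with separate hash-encoded feature grids for static and dynamic regions, from which a discrete point cloud is sampled and then rasterized. The following is a minimal sketch of that idea, not the authors' implementation: the class names, dimensions, and the plain-MLP stand-in for hash-encoded grids are assumptions made purely for illustration.

```python
# Minimal sketch (not the authors' code) of a time-conditioned point model with
# separate feature branches for static and dynamic regions. Hash-grid encodings
# are replaced by plain MLPs for brevity; all names/sizes are hypothetical.
import torch
import torch.nn as nn


class FeatureField(nn.Module):
    """Stand-in for a hash-encoded feature grid: maps (x, y, z[, t]) to a feature."""

    def __init__(self, in_dim, feat_dim=16, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, feat_dim),
        )

    def forward(self, x):
        return self.net(x)


class DynamicNeuralPointCloud(nn.Module):
    """Implicit, time-conditioned point distribution with static/dynamic branches."""

    def __init__(self, num_points=4096, feat_dim=16):
        super().__init__()
        # Learnable point positions; in the paper these could instead be
        # initialized from monocular depth priors.
        self.points = nn.Parameter(torch.rand(num_points, 3) * 2 - 1)
        self.static_grid = FeatureField(3, feat_dim)    # static geometry/appearance
        self.dynamic_grid = FeatureField(4, feat_dim)   # time-conditioned features
        self.dynamic_mask = nn.Parameter(torch.zeros(num_points))  # soft static/dynamic split

    def sample(self, t):
        """Sample a discrete point cloud with per-point features at time t in [0, 1]."""
        pts = self.points
        t_col = torch.full((pts.shape[0], 1), float(t))
        static_feat = self.static_grid(pts)
        dynamic_feat = self.dynamic_grid(torch.cat([pts, t_col], dim=-1))
        w = torch.sigmoid(self.dynamic_mask).unsqueeze(-1)
        feats = (1 - w) * static_feat + w * dynamic_feat
        return pts, feats  # points + features for downstream rasterization


if __name__ == "__main__":
    model = DynamicNeuralPointCloud()
    pts, feats = model.sample(t=0.5)
    print(pts.shape, feats.shape)  # (4096, 3), (4096, 16)
```

In the method described in the abstract, the sampled points and features would then be passed to a fast differentiable rasterizer and a neural rendering network, with monocular depth and segmentation priors used to initialize and guide the optimization; this sketch omits those stages.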
Related papers
- CTNeRF: Cross-Time Transformer for Dynamic Neural Radiance Field from Monocular Video [25.551944406980297]
We propose a novel approach to generate high-quality novel views from monocular videos of complex and dynamic scenes.
We introduce a module that operates in both the time and frequency domains to aggregate the features of object motion.
Our experiments demonstrate significant improvements over state-of-the-art methods on dynamic scene datasets.
arXiv Detail & Related papers (2024-01-10T00:40:05Z) - Fast Non-Rigid Radiance Fields from Monocularized Data [66.74229489512683]
This paper proposes a new method for full 360° inward-facing novel view synthesis of non-rigidly deforming scenes.
At the core of our method are 1) An efficient deformation module that decouples the processing of spatial and temporal information for accelerated training and inference; and 2) A static module representing the canonical scene as a fast hash-encoded neural radiance field.
In both cases, our method is significantly faster than previous methods, converging in less than 7 minutes and achieving real-time framerates at 1K resolution, while obtaining a higher visual accuracy for generated novel views.
arXiv Detail & Related papers (2022-12-02T18:51:10Z) - DynIBaR: Neural Dynamic Image-Based Rendering [79.44655794967741]
We address the problem of synthesizing novel views from a monocular video depicting a complex dynamic scene.
We adopt a volumetric image-based rendering framework that synthesizes new viewpoints by aggregating features from nearby views.
We demonstrate significant improvements over state-of-the-art methods on dynamic scene datasets.
arXiv Detail & Related papers (2022-11-20T20:57:02Z) - Human Performance Modeling and Rendering via Neural Animated Mesh [40.25449482006199]
We bridge traditional mesh workflows with a new class of neural rendering techniques.
In this paper, we present a novel approach for rendering novel views of human performances from video.
We demonstrate our approach on various platforms, inserting virtual human performances into AR headsets.
arXiv Detail & Related papers (2022-09-18T03:58:00Z) - Learning Dynamic View Synthesis With Few RGBD Cameras [60.36357774688289]
We propose to utilize RGBD cameras to synthesize free-viewpoint videos of dynamic indoor scenes.
We generate point clouds from RGBD frames and then render them into free-viewpoint videos via neural rendering.
We introduce a simple Regional Depth-Inpainting module that adaptively inpaints missing depth values to render complete novel views.
arXiv Detail & Related papers (2022-04-22T03:17:35Z) - NPBG++: Accelerating Neural Point-Based Graphics [14.366073496519139]
NPBG++ is a novel view synthesis (NVS) method that achieves high rendering realism with low scene fitting time.
Our method efficiently leverages the multiview observations and the point cloud of a static scene to predict a neural descriptor for each point.
In our comparisons, the proposed system outperforms previous NVS approaches in terms of fitting and rendering runtimes while producing images of similar quality.
arXiv Detail & Related papers (2022-03-24T19:59:39Z) - Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Dynamic Scene From Monocular Video [76.19076002661157]
Non-Rigid Neural Radiance Fields (NR-NeRF) is a reconstruction and novel view synthesis approach for general non-rigid dynamic scenes.
We show that even a single consumer-grade camera is sufficient to synthesize sophisticated renderings of a dynamic scene from novel virtual camera views.
arXiv Detail & Related papers (2020-12-22T18:46:12Z) - Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes [70.76742458931935]
We introduce a new representation that models the dynamic scene as a time-variant continuous function of appearance, geometry, and 3D scene motion.
Our representation is optimized through a neural network to fit the observed input views.
We show that our representation can be used for complex dynamic scenes, including thin structures, view-dependent effects, and natural degrees of motion (a minimal sketch of such a representation follows this list).
arXiv Detail & Related papers (2020-11-26T01:23:44Z)
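The Neural Scene Flow Fields entry above models the dynamic scene as a time-variant continuous function of appearance, geometry, and 3D scene motion, optimized to fit the observed input views. Below is a minimal, hedged sketch of such a representation as a single MLP; the layer sizes, output parameterization, and names are illustrative assumptions rather than the authors' architecture.

```python
# Hedged sketch (not the NSFF authors' code): a time-variant continuous scene
# function F(x, t) -> (color, density, 3D scene flow), modeled as a small MLP.
import torch
import torch.nn as nn


class TimeVariantSceneFunction(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(),   # input: (x, y, z, t)
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.rgb = nn.Linear(hidden, 3)        # appearance
        self.sigma = nn.Linear(hidden, 1)      # geometry (volume density)
        self.flow = nn.Linear(hidden, 6)       # 3D scene flow toward t-1 and t+1

    def forward(self, x, t):
        h = self.backbone(torch.cat([x, t], dim=-1))
        return torch.sigmoid(self.rgb(h)), torch.relu(self.sigma(h)), self.flow(h)


# Usage: query 1024 points at a single time step.
model = TimeVariantSceneFunction()
x = torch.rand(1024, 3)
t = torch.full((1024, 1), 0.25)
rgb, sigma, flow = model(x, t)
```

In practice such a field would be queried along camera rays and supervised with photometric and flow-based losses; this sketch omits the rendering and optimization steps.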
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.