Temporal-MPI: Enabling Multi-Plane Images for Dynamic Scene Modelling via Temporal Basis Learning
- URL: http://arxiv.org/abs/2111.10533v1
- Date: Sat, 20 Nov 2021 07:34:28 GMT
- Title: Temporal-MPI: Enabling Multi-Plane Images for Dynamic Scene Modelling via Temporal Basis Learning
- Authors: Wenpeng Xing, Jie Chen
- Abstract summary: We propose a novel Temporal-MPI representation that encodes the rich 3D and dynamic variation information throughout the entire video as a compact temporal basis.
Our proposed Temporal-MPI framework generates a time-instance MPI in only 0.002 seconds, up to 3000 times faster than other state-of-the-art dynamic scene modelling frameworks, with 3 dB higher average view-synthesis PSNR.
- Score: 6.952039070065292
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Novel view synthesis of static scenes has achieved remarkable advancements in
producing photo-realistic results. However, key challenges remain for immersive
rendering of dynamic content. For example, one of the seminal image-based
rendering frameworks, the multi-plane image (MPI), produces high novel-view
synthesis quality for static scenes but has difficulty modeling dynamic
parts. In addition, modeling dynamic variations through MPI may require huge
storage space and long inference time, which hinders its application in
real-time scenarios. In this paper, we propose a novel Temporal-MPI
representation that encodes the rich 3D and dynamic variation
information throughout the entire video as a compact temporal basis. Novel views
at arbitrary time instances can be rendered in real time with high
visual quality, thanks to the highly compact and expressive latent basis and the
jointly learned coefficients. We show that, given comparable memory consumption,
our proposed Temporal-MPI framework generates a time-instance MPI
in only 0.002 seconds, up to 3000 times faster than other state-of-the-art dynamic
scene modelling frameworks, with 3 dB higher average view-synthesis PSNR.
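To make the basis-plus-coefficient idea concrete, here is a minimal sketch, not the authors' implementation: the tensor shapes and the names `temporal_basis`, `coeffs_at`, `mpi_at_time`, and `composite_front_to_back` are assumptions, and the per-plane homography warp to the target viewpoint is omitted. A time-instance MPI is assembled as a linear combination of a learned, time-invariant temporal basis and then alpha-composited along depth.

```python
import torch

# Hypothetical sizes: D depth planes, B temporal basis vectors, H x W plane resolution.
D, B, H, W = 32, 8, 64, 64

# Time-invariant temporal basis: B per-plane RGBA "basis images" (layout assumed).
temporal_basis = torch.randn(B, D, 4, H, W)

def coeffs_at(t: float) -> torch.Tensor:
    """Stand-in for the jointly learned per-time coefficients (random here)."""
    return torch.randn(B)

def mpi_at_time(t: float):
    """Assemble a time-instance MPI as a linear combination of the basis."""
    c = coeffs_at(t)                                   # (B,)
    mpi = torch.einsum('b,bdchw->dchw', c, temporal_basis)
    rgb = mpi[:, :3]                                   # (D, 3, H, W)
    alpha = torch.sigmoid(mpi[:, 3:])                  # (D, 1, H, W), values in (0, 1)
    return rgb, alpha

def composite_front_to_back(rgb, alpha):
    """Standard MPI over-compositing along depth, nearest plane first."""
    out = torch.zeros(3, H, W)
    transmittance = torch.ones(1, H, W)
    for d in range(D):
        out = out + transmittance * alpha[d] * rgb[d]
        transmittance = transmittance * (1.0 - alpha[d])
    return out

rgb, alpha = mpi_at_time(0.5)
novel_view = composite_front_to_back(rgb, alpha)
print(novel_view.shape)                                # torch.Size([3, 64, 64])
```

The appeal of such a factorization, as the abstract describes it, is that the basis is shared across the whole video, so producing the MPI for a new time instance costs only a small weighted sum rather than a full per-frame reconstruction.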
Related papers
- CD-NGP: A Fast Scalable Continual Representation for Dynamic Scenes [9.217592165862762]
We propose continual dynamic neural graphics primitives (CD-NGP) for view synthesis.
Our approach synergizes features from both temporal and spatial hash encodings to achieve high rendering quality.
We introduce a novel dataset comprising multi-view, exceptionally long video sequences with substantial rigid and non-rigid motion.
arXiv Detail & Related papers (2024-09-08T17:35:48Z)
- MultiDiff: Consistent Novel View Synthesis from a Single Image [60.04215655745264]
MultiDiff is a novel approach for consistent novel view synthesis of scenes from a single RGB image.
Our results demonstrate that MultiDiff outperforms state-of-the-art methods on the challenging, real-world datasets RealEstate10K and ScanNet.
arXiv Detail & Related papers (2024-06-26T17:53:51Z)
- Modeling Ambient Scene Dynamics for Free-view Synthesis [31.233859111566613]
We introduce a novel method for dynamic free-view synthesis of ambient scenes from a monocular capture.
Our method builds upon the recent advancements in 3D Gaussian Splatting (3DGS) that can faithfully reconstruct complex static scenes.
arXiv Detail & Related papers (2024-06-13T17:59:11Z)
- Dynamic 3D Gaussian Fields for Urban Areas [60.64840836584623]
We present an efficient neural 3D scene representation for novel-view synthesis (NVS) in large-scale, dynamic urban areas.
We propose 4DGF, a neural scene representation that scales to large-scale dynamic urban areas.
arXiv Detail & Related papers (2024-06-05T12:07:39Z)
- RAVEN: Rethinking Adversarial Video Generation with Efficient Tri-plane Networks [93.18404922542702]
We present a novel video generative model designed to address long-term spatial and temporal dependencies.
Our approach incorporates a hybrid explicit-implicit tri-plane representation inspired by 3D-aware generative frameworks.
Our model synthesizes high-fidelity video clips at a resolution of $256\times256$ pixels, with durations extending to more than $5$ seconds at a frame rate of 30 fps.
arXiv Detail & Related papers (2024-01-11T16:48:44Z)
- SWinGS: Sliding Windows for Dynamic 3D Gaussian Splatting [7.553079256251747]
We extend 3D Gaussian Splatting to reconstruct dynamic scenes.
We produce high-quality renderings of general dynamic scenes with competitive quantitative performance.
Our method can be viewed in real-time in our dynamic interactive viewer.
arXiv Detail & Related papers (2023-12-20T03:54:03Z)
- DynMF: Neural Motion Factorization for Real-time Dynamic View Synthesis with 3D Gaussian Splatting [35.69069478773709]
We argue that the per-point motions of a dynamic scene can be decomposed into a small set of explicit or learned trajectories.
Our representation is interpretable, efficient, and expressive enough to offer real-time view synthesis of complex dynamic scene motions.
arXiv Detail & Related papers (2023-11-30T18:59:11Z)
- Periodic Vibration Gaussian: Dynamic Urban Scene Reconstruction and Real-time Rendering [36.111845416439095]
We present a unified representation model, called Periodic Vibration Gaussian (PVG).
PVG builds upon the efficient 3D Gaussian splatting technique, originally designed for static scene representation.
PVG exhibits 900-fold acceleration in rendering over the best alternative.
arXiv Detail & Related papers (2023-11-30T13:53:50Z)
- Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring [82.84513669453744]
Image-text pretrained models, e.g., CLIP, have shown impressive general multi-modal knowledge learned from large-scale image-text data pairs.
We revisit temporal modeling in the context of image-to-video knowledge transferring.
We present a simple and effective temporal modeling mechanism extending CLIP model to diverse video tasks.
arXiv Detail & Related papers (2023-01-26T14:12:02Z)
- DynIBaR: Neural Dynamic Image-Based Rendering [79.44655794967741]
We address the problem of synthesizing novel views from a monocular video depicting a complex dynamic scene.
We adopt a volumetric image-based rendering framework that synthesizes new viewpoints by aggregating features from nearby views.
We demonstrate significant improvements over state-of-the-art methods on dynamic scene datasets.
arXiv Detail & Related papers (2022-11-20T20:57:02Z)
- Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes [70.76742458931935]
We introduce a new representation that models the dynamic scene as a time-variant continuous function of appearance, geometry, and 3D scene motion.
Our representation is optimized through a neural network to fit the observed input views.
We show that our representation can be used for complex dynamic scenes, including thin structures, view-dependent effects, and natural degrees of motion (a minimal sketch of such a time-conditioned field follows this list).
arXiv Detail & Related papers (2020-11-26T01:23:44Z)
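The "time-variant continuous function of appearance, geometry, and 3D scene motion" described in the Neural Scene Flow Fields entry can be pictured as a coordinate network queried with position and time. Below is a minimal sketch under stated assumptions, not the paper's architecture: the class name `SceneFlowField`, the layer widths, and the output layout are illustrative, and positional encoding plus the separate static/dynamic branches of the real model are omitted.

```python
import torch
import torch.nn as nn

class SceneFlowField(nn.Module):
    """Toy time-conditioned field: (x, y, z, t) -> (RGB, density, 3D scene flow)."""

    def __init__(self, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3 + 1 + 3),   # rgb, sigma, forward scene flow
        )

    def forward(self, xyz: torch.Tensor, t: torch.Tensor):
        out = self.net(torch.cat([xyz, t], dim=-1))
        rgb = torch.sigmoid(out[..., :3])   # colour in [0, 1]
        sigma = torch.relu(out[..., 3:4])   # non-negative volume density
        flow = out[..., 4:]                 # 3D motion toward the next time step
        return rgb, sigma, flow

# Query a batch of sample points at one time instance.
field = SceneFlowField()
xyz = torch.rand(1024, 3)
t = torch.full((1024, 1), 0.25)
rgb, sigma, flow = field(xyz, t)
print(rgb.shape, sigma.shape, flow.shape)   # (1024, 3) (1024, 1) (1024, 3)
```

This is the opposite design point from the Temporal-MPI factorization above: rendering requires evaluating the network at many samples per ray, which is what the compact basis-and-coefficient representation is meant to avoid at inference time.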