Grid4D: 4D Decomposed Hash Encoding for High-fidelity Dynamic Gaussian Splatting
- URL: http://arxiv.org/abs/2410.20815v1
- Date: Mon, 28 Oct 2024 08:02:34 GMT
- Title: Grid4D: 4D Decomposed Hash Encoding for High-fidelity Dynamic Gaussian Splatting
- Authors: Jiawei Xu, Zexin Fan, Jian Yang, Jin Xie
- Abstract summary: We propose Grid4D, a dynamic scene rendering model based on Gaussian splatting.
We decompose the 4D encoding into one spatial and three temporal 3D hash encodings without the low-rank assumption.
Our experiments demonstrate that Grid4D significantly outperforms the state-of-the-art models in visual quality and rendering speed.
- Score: 21.47981274362659
- License:
- Abstract: Recently, Gaussian splatting has received more and more attention in the field of static scene rendering. Due to the low computational overhead and inherent flexibility of explicit representations, plane-based explicit methods are popular ways to predict deformations for Gaussian-based dynamic scene rendering models. However, plane-based methods rely on the inappropriate low-rank assumption and excessively decompose the space-time 4D encoding, resulting in overmuch feature overlap and unsatisfactory rendering quality. To tackle these problems, we propose Grid4D, a dynamic scene rendering model based on Gaussian splatting and employing a novel explicit encoding method for the 4D input through the hash encoding. Different from plane-based explicit representations, we decompose the 4D encoding into one spatial and three temporal 3D hash encodings without the low-rank assumption. Additionally, we design a novel attention module that generates the attention scores in a directional range to aggregate the spatial and temporal features. The directional attention enables Grid4D to more accurately fit the diverse deformations across distinct scene components based on the spatial encoded features. Moreover, to mitigate the inherent lack of smoothness in explicit representation methods, we introduce a smooth regularization term that keeps our model from the chaos of deformation prediction. Our experiments demonstrate that Grid4D significantly outperforms the state-of-the-art models in visual quality and rendering speed.
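The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the three temporal encodings take the coordinate triplets (x, y, t), (x, z, t), and (y, z, t), which the abstract does not spell out. It also uses a single-resolution grid with nearest-vertex lookup in place of a multiresolution, interpolated hash encoding. `HashGrid3D` and `encode_4d` are hypothetical names, and the directional attention module and smooth regularization term are omitted.

```python
import numpy as np

# Primes often used for spatial hashing of integer grid coordinates.
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)


class HashGrid3D:
    """Single-resolution 3D hash grid with nearest-vertex lookup (hypothetical helper)."""

    def __init__(self, resolution, table_size, feat_dim, rng):
        self.resolution = resolution
        self.table_size = table_size
        # Learnable features in a real model; random values here for illustration.
        self.table = rng.normal(scale=1e-2, size=(table_size, feat_dim))

    def __call__(self, coords):
        # coords: (N, 3) array with values in [0, 1].
        scaled = np.rint(coords * (self.resolution - 1))
        idx = np.clip(scaled, 0, self.resolution - 1).astype(np.uint64)
        hashed = idx * PRIMES  # per-axis multiply, wraps modulo 2**64
        h = hashed[:, 0] ^ hashed[:, 1] ^ hashed[:, 2]
        slots = (h % np.uint64(self.table_size)).astype(np.int64)
        return self.table[slots]


def encode_4d(xyzt, grids):
    """Encode (N, 4) space-time points with one spatial and three temporal 3D grids."""
    x, y, z, t = xyzt.T
    # Assumed triplets: the abstract only says "one spatial and three temporal".
    triplets = {
        "xyz": np.stack([x, y, z], axis=1),
        "xyt": np.stack([x, y, t], axis=1),
        "xzt": np.stack([x, z, t], axis=1),
        "yzt": np.stack([y, z, t], axis=1),
    }
    spatial = grids["xyz"](triplets["xyz"])
    temporal = np.concatenate(
        [grids[k](triplets[k]) for k in ("xyt", "xzt", "yzt")], axis=1
    )
    return spatial, temporal


rng = np.random.default_rng(0)
grids = {k: HashGrid3D(32, 2**14, 4, rng) for k in ("xyz", "xyt", "xzt", "yzt")}
points = rng.random((5, 4))  # columns: x, y, z, t in [0, 1)
spatial, temporal = encode_4d(points, grids)
print(spatial.shape, temporal.shape)
```

Each lookup indexes a full 3D grid rather than a 2D plane, so no feature is formed as a product of per-plane factors. That is the sense in which this sketch avoids the low-rank assumption. The spatial features depend only on (x, y, z) and are therefore constant over time for a fixed point.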
Related papers
- DreamMesh4D: Video-to-4D Generation with Sparse-Controlled Gaussian-Mesh Hybrid Representation [10.250715657201363]
We introduce DreamMesh4D, a novel framework combining mesh representation with a geometric skinning technique to generate high-quality 4D objects from a monocular video.
Our method is compatible with modern graphic pipelines, showcasing its potential in the 3D gaming and film industry.
arXiv Detail & Related papers (2024-10-09T10:41:08Z) - GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization [1.4466437171584356]
3D Gaussian Splatting (3DGS) allows for the compact encoding of both 3D geometry and scene appearance with its spatial features.
We propose distilling dense keypoint descriptors into 3DGS to improve the model's spatial understanding.
Our approach surpasses state-of-the-art Neural Render Pose (NRP) methods, including NeRFMatch and PNeRFLoc.
arXiv Detail & Related papers (2024-09-24T23:18:32Z) - A Refined 3D Gaussian Representation for High-Quality Dynamic Scene Reconstruction [2.022451212187598]
In recent years, Neural Radiance Fields (NeRF) has revolutionized three-dimensional (3D) reconstruction with its implicit representation.
3D Gaussian Splatting (3D-GS) has departed from the implicit representation of neural networks and instead directly represents scenes as point clouds with Gaussian-shaped distributions.
This paper proposes a refined 3D Gaussian representation for high-quality dynamic scene reconstruction.
Experimental results demonstrate that our method surpasses existing approaches in rendering quality and speed, while significantly reducing the memory usage associated with 3D-GS.
arXiv Detail & Related papers (2024-05-28T07:12:22Z) - SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer [57.506654943449796]
We propose an efficient, sparse-controlled video-to-4D framework named SC4D that decouples motion and appearance.
Our method surpasses existing methods in both quality and efficiency.
We devise a novel application that seamlessly transfers motion onto a diverse array of 4D entities.
arXiv Detail & Related papers (2024-04-04T18:05:18Z) - latentSplat: Autoencoding Variational Gaussians for Fast Generalizable 3D Reconstruction [48.86083272054711]
latentSplat is a method to predict semantic Gaussians in a 3D latent space that can be splatted and decoded by a light-weight generative 2D architecture.
We show that latentSplat outperforms previous works in reconstruction quality and generalization, while being fast and scalable to high-resolution data.
arXiv Detail & Related papers (2024-03-24T20:48:36Z) - Motion2VecSets: 4D Latent Vector Set Diffusion for Non-rigid Shape Reconstruction and Tracking [52.393359791978035]
Motion2VecSets is a 4D diffusion model for dynamic surface reconstruction from point cloud sequences.
We parameterize 4D dynamics with latent sets instead of using global latent codes.
For more temporally-coherent object tracking, we synchronously denoise deformation latent sets and exchange information across multiple frames.
arXiv Detail & Related papers (2024-01-12T15:05:08Z) - Gaussian-Flow: 4D Reconstruction with Dynamic 3D Gaussian Particle [9.082693946898733]
We introduce a novel point-based approach for fast dynamic scene reconstruction and real-time rendering from both multi-view and monocular videos.
In contrast to the prevalent NeRF-based approaches hampered by slow training and rendering speeds, our approach harnesses recent advancements in point-based 3D Gaussian Splatting (3DGS).
Our proposed approach showcases a substantial efficiency improvement, achieving a $5\times$ faster training speed compared to the per-frame 3DGS modeling.
arXiv Detail & Related papers (2023-12-06T11:25:52Z) - Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering [71.44349029439944]
The recent 3D Gaussian Splatting method has achieved state-of-the-art rendering quality and speed.
We introduce Scaffold-GS, which uses anchor points to distribute local 3D Gaussians.
We show that our method effectively reduces redundant Gaussians while delivering high-quality rendering.
arXiv Detail & Related papers (2023-11-30T17:58:57Z) - 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering [103.32717396287751]
We propose 4D Gaussian Splatting (4D-GS) as a holistic representation for dynamic scenes.
A neural voxel encoding algorithm inspired by HexPlane is proposed to efficiently build features from 4D neural voxels.
Our 4D-GS method achieves real-time rendering under high resolutions, 82 FPS at an 800$\times$800 resolution on an RTX 3090 GPU.
arXiv Detail & Related papers (2023-10-12T17:21:41Z) - Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction [29.83056271799794]
Implicit neural representation has paved the way for new approaches to dynamic scene reconstruction and rendering.
We propose a deformable 3D Gaussians Splatting method that reconstructs scenes using 3D Gaussians and learns them in canonical space.
Through a differential Gaussian rasterizer, the deformable 3D Gaussians not only achieve higher rendering quality but also real-time rendering speed.
arXiv Detail & Related papers (2023-09-22T16:04:02Z) - LoRD: Local 4D Implicit Representation for High-Fidelity Dynamic Human Modeling [69.56581851211841]
We propose a novel Local 4D implicit Representation for Dynamic clothed human, named LoRD.
Our key insight is to encourage the network to learn the latent codes of local part-level representation.
LoRD has a strong capability for representing 4D humans and outperforms state-of-the-art methods on practical applications.
arXiv Detail & Related papers (2022-08-18T03:49:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.