HexPlane: A Fast Representation for Dynamic Scenes
- URL: http://arxiv.org/abs/2301.09632v2
- Date: Mon, 27 Mar 2023 16:39:58 GMT
- Title: HexPlane: A Fast Representation for Dynamic Scenes
- Authors: Ang Cao, Justin Johnson
- Abstract summary: We show that dynamic 3D scenes can be explicitly represented by six planes of learned features, leading to an elegant solution we call HexPlane.
A HexPlane computes features for points in spacetime by fusing vectors extracted from each plane, which is highly efficient.
- Score: 18.276921637560445
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modeling and re-rendering dynamic 3D scenes is a challenging task in 3D
vision. Prior approaches build on NeRF and rely on implicit representations.
This is slow since it requires many MLP evaluations, constraining real-world
applications. We show that dynamic 3D scenes can be explicitly represented by
six planes of learned features, leading to an elegant solution we call
HexPlane. A HexPlane computes features for points in spacetime by fusing
vectors extracted from each plane, which is highly efficient. Pairing a
HexPlane with a tiny MLP to regress output colors and training via volume
rendering gives impressive results for novel view synthesis on dynamic scenes,
matching the image quality of prior work but reducing training time by more
than $100\times$. Extensive ablations confirm our HexPlane design and show that
it is robust to different feature fusion mechanisms, coordinate systems, and
decoding mechanisms. HexPlane is a simple and effective solution for
representing 4D volumes, and we hope they can broadly contribute to modeling
spacetime for dynamic 3D scenes.
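The fusion described in the abstract — extracting a vector from each of the six planes and combining them — can be illustrated with a minimal sketch. This is a hypothetical simplification (nearest-neighbor lookup instead of bilinear interpolation, random planes, illustrative resolution and channel counts), pairing each space-space plane with its complementary space-time plane and multiplying their feature vectors elementwise:

```python
import numpy as np

R, F = 32, 8  # plane resolution and feature channels (illustrative values)

# Six learned feature planes over spacetime axes:
# three space-space (xy, xz, yz) and three space-time (zt, yt, xt).
rng = np.random.default_rng(0)
planes = {ax: rng.standard_normal((R, R, F))
          for ax in ["xy", "zt", "xz", "yt", "yz", "xt"]}

def sample(plane, u, v):
    """Nearest-neighbor lookup of a feature vector at normalized coords (u, v)."""
    i = min(int(u * R), R - 1)
    j = min(int(v * R), R - 1)
    return plane[i, j]

def hexplane_features(x, y, z, t):
    """Fuse per-plane vectors: multiply complementary plane pairs, concatenate."""
    pairs = [("xy", (x, y), "zt", (z, t)),
             ("xz", (x, z), "yt", (y, t)),
             ("yz", (y, z), "xt", (x, t))]
    feats = [sample(planes[a], *uv_a) * sample(planes[b], *uv_b)
             for a, uv_a, b, uv_b in pairs]
    return np.concatenate(feats)  # fed to a tiny MLP in the full method

feat = hexplane_features(0.1, 0.5, 0.9, 0.3)
print(feat.shape)  # (24,) = 3 plane pairs x 8 channels
```

The appeal of this design is that a spacetime query costs only six 2D lookups and a few elementwise products, rather than many MLP evaluations.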
Related papers
- DynamicCity: Large-Scale LiDAR Generation from Dynamic Scenes [61.07023022220073]
We introduce DynamicCity, a novel 4D LiDAR generation framework capable of generating large-scale, high-quality LiDAR scenes.
In particular, DynamicCity employs a novel Projection Module to effectively compress 4D LiDAR features into six 2D feature maps for HexPlane construction.
Additionally, a Padded Rollout Operation is proposed to reorganize all six feature planes of the HexPlane into a square 2D feature map.
arXiv Detail & Related papers (2024-10-23T17:59:58Z) - DaRePlane: Direction-aware Representations for Dynamic Scene Reconstruction [26.39519157164198]
We present DaRePlane, a novel representation approach that captures dynamics from six different directions.
DaRePlane yields state-of-the-art performance in novel view synthesis for various complex dynamic scenes.
arXiv Detail & Related papers (2024-10-18T04:19:10Z) - OSN: Infinite Representations of Dynamic 3D Scenes from Monocular Videos [7.616167860385134]
It has long been challenging to recover the underlying dynamic 3D scene representations from a monocular RGB video.
We introduce a new framework, called OSN, to learn all plausible 3D scene configurations that match the input video.
Our method demonstrates a clear advantage in learning fine-grained 3D scene geometry.
arXiv Detail & Related papers (2024-07-08T05:03:46Z) - BerfScene: Bev-conditioned Equivariant Radiance Fields for Infinite 3D Scene Generation [96.58789785954409]
We propose a practical and efficient 3D representation that incorporates an equivariant radiance field with the guidance of a bird's-eye view map.
We produce large-scale, even infinite-scale, 3D scenes via synthesizing local scenes and then stitching them with smooth consistency.
arXiv Detail & Related papers (2023-12-04T18:56:10Z) - Im4D: High-Fidelity and Real-Time Novel View Synthesis for Dynamic Scenes [69.52540205439989]
We introduce Im4D, a hybrid representation that consists of a grid-based geometry representation and a multi-view image-based appearance representation.
We represent the scene appearance by the original multi-view videos and a network that learns to predict the color of a 3D point from image features.
We show that Im4D achieves state-of-the-art rendering quality and can be trained efficiently, while realizing real-time rendering at 79.8 FPS for 512x512 images.
arXiv Detail & Related papers (2023-10-12T17:59:57Z) - 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering [103.32717396287751]
We propose 4D Gaussian Splatting (4D-GS) as a holistic representation for dynamic scenes.
A neural voxel encoding algorithm inspired by HexPlane is proposed to efficiently build features from 4D neural voxels.
Our 4D-GS method achieves real-time rendering under high resolutions, 82 FPS at an $800\times 800$ resolution on a 3090 GPU.
arXiv Detail & Related papers (2023-10-12T17:21:41Z) - Incremental 3D Semantic Scene Graph Prediction from RGB Sequences [86.77318031029404]
We propose a real-time framework that incrementally builds a consistent 3D semantic scene graph of a scene given an RGB image sequence.
Our method consists of a novel incremental entity estimation pipeline and a scene graph prediction network.
The proposed network estimates 3D semantic scene graphs with iterative message passing using multi-view and geometric features extracted from the scene entities.
arXiv Detail & Related papers (2023-05-04T11:32:16Z) - K-Planes: Explicit Radiance Fields in Space, Time, and Appearance [32.78595254330191]
We introduce k-planes, a white-box model for radiance fields in arbitrary dimensions.
Our model uses d choose 2 planes to represent a d-dimensional scene, providing a seamless way to go from static to dynamic scenes.
Across a range of synthetic and real, static and dynamic, fixed and varying appearance scenes, k-planes yields competitive and often state-of-the-art reconstruction fidelity.
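The "d choose 2" plane count mentioned above can be checked with a one-line computation; note that for a 4D dynamic scene it reproduces HexPlane's six planes:

```python
from math import comb

# k-planes uses C(d, 2) axis-aligned planes for a d-dimensional scene.
static_planes = comb(3, 2)   # 3 planes for a static (x, y, z) scene
dynamic_planes = comb(4, 2)  # 6 planes for a dynamic (x, y, z, t) scene
print(static_planes, dynamic_planes)  # 3 6
```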
arXiv Detail & Related papers (2023-01-24T18:59:08Z) - VoxGRAF: Fast 3D-Aware Image Synthesis with Sparse Voxel Grids [42.74658047803192]
State-of-the-art 3D-aware generative models rely on coordinate-based MLPs to parameterize 3D radiance fields.
Existing approaches often render low-resolution feature maps and process them with an upsampling network to obtain the final image.
In contrast to existing approaches, our method requires only a single forward pass to generate a full 3D scene.
arXiv Detail & Related papers (2022-06-15T17:44:22Z) - Neural 3D Scene Reconstruction with the Manhattan-world Assumption [58.90559966227361]
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images.
Planar constraints can be conveniently integrated into the recent implicit neural representation-based reconstruction methods.
The proposed method outperforms previous methods by a large margin on 3D reconstruction quality.
arXiv Detail & Related papers (2022-05-05T17:59:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.