Self-Aligning Depth-regularized Radiance Fields for Asynchronous RGB-D Sequences
- URL: http://arxiv.org/abs/2211.07459v2
- Date: Thu, 4 Apr 2024 08:24:54 GMT
- Title: Self-Aligning Depth-regularized Radiance Fields for Asynchronous RGB-D Sequences
- Authors: Yuxin Huang, Andong Yang, Zirui Wu, Yuantao Chen, Runyi Yang, Zhenxin Zhu, Chao Hou, Hao Zhao, Guyue Zhou,
- Abstract summary: We propose a novel time-pose function, which is an implicit network that maps timestamps to $rm SE(3)$ elements.
Our algorithm consists of three steps: (1) time-pose function fitting, (2) radiance field bootstrapping, (3) joint pose error compensation and radiance field refinement.
We also show qualitatively improved results on a real-world asynchronous RGB-D sequence captured by drone.
- Score: 12.799443250845224
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It has been shown that learning radiance fields with depth rendering and depth supervision can effectively promote the quality and convergence of view synthesis. However, this paradigm requires input RGB-D sequences to be synchronized, hindering its usage in the UAV city modeling scenario. As there exists asynchrony between RGB images and depth images due to high-speed flight, we propose a novel time-pose function, which is an implicit network that maps timestamps to $\rm SE(3)$ elements. To simplify the training process, we also design a joint optimization scheme to jointly learn the large-scale depth-regularized radiance fields and the time-pose function. Our algorithm consists of three steps: (1) time-pose function fitting, (2) radiance field bootstrapping, (3) joint pose error compensation and radiance field refinement. In addition, we propose a large synthetic dataset with diverse controlled mismatches and ground truth to evaluate this new problem setting systematically. Through extensive experiments, we demonstrate that our method outperforms baselines without regularization. We also show qualitatively improved results on a real-world asynchronous RGB-D sequence captured by drone. Codes, data, and models will be made publicly available.
Related papers
- EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization.
We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z) - Discovering an Image-Adaptive Coordinate System for Photography Processing [51.164345878060956]
We propose a novel algorithm, IAC, to learn an image-adaptive coordinate system in the RGB color space before performing curve operations.
This end-to-end trainable approach enables us to efficiently adjust images with a jointly learned image-adaptive coordinate system and curves.
arXiv Detail & Related papers (2025-01-11T06:20:07Z) - Sparse Voxels Rasterization: Real-time High-fidelity Radiance Field Rendering [37.48219196092378]
We propose an efficient radiance field rendering algorithm that incorporates a synthesis process on adaptive sparse voxels without neural networks or 3D Gaussians.
Our method improves the previous neural-free voxel model by over 4db PSNR and more than 10x FPS speedup.
Our voxel representation is seamlessly compatible with grid-based 3D processing techniques such as Volume Fusion, Voxel Pooling, and Marching Cubes.
arXiv Detail & Related papers (2024-12-05T18:59:11Z) - GPS-Gaussian+: Generalizable Pixel-wise 3D Gaussian Splatting for Real-Time Human-Scene Rendering from Sparse Views [67.34073368933814]
We propose a generalizable Gaussian Splatting approach for high-resolution image rendering under a sparse-view camera setting.
We train our Gaussian parameter regression module on human-only data or human-scene data, jointly with a depth estimation module to lift 2D parameter maps to 3D space.
Experiments on several datasets demonstrate that our method outperforms state-of-the-art methods while achieving an exceeding rendering speed.
arXiv Detail & Related papers (2024-11-18T08:18:44Z) - GUS-IR: Gaussian Splatting with Unified Shading for Inverse Rendering [83.69136534797686]
We present GUS-IR, a novel framework designed to address the inverse rendering problem for complicated scenes featuring rough and glossy surfaces.
This paper starts by analyzing and comparing two prominent shading techniques popularly used for inverse rendering, forward shading and deferred shading.
We propose a unified shading solution that combines the advantages of both techniques for better decomposition.
arXiv Detail & Related papers (2024-11-12T01:51:05Z) - Visual SLAM with 3D Gaussian Primitives and Depth Priors Enabling Novel View Synthesis [11.236094544193605]
Conventional geometry-based SLAM systems lack dense 3D reconstruction capabilities.
We propose a real-time RGB-D SLAM system that incorporates a novel view synthesis technique, 3D Gaussian Splatting.
arXiv Detail & Related papers (2024-08-10T21:23:08Z) - PRTGaussian: Efficient Relighting Using 3D Gaussians with Precomputed Radiance Transfer [13.869132334647771]
PRTGaussian is a realtime relightable novel-view synthesis method.
By fitting relightable Gaussians to multi-view OLAT data, our method enables real-time, free-viewpoint relighting.
arXiv Detail & Related papers (2024-08-10T20:57:38Z) - Splatfacto-W: A Nerfstudio Implementation of Gaussian Splatting for Unconstrained Photo Collections [25.154665328053333]
We introduce Splatfacto-W, an in-trivial approach that integrates per-Gaussian neural color features and per-image appearance embeddings into an rendering process.
Our method improves the Peak Signal-to-Noise Ratio (PSNR) by an average of 5.3 dB compared to 3DGS, enhances training speed by 150 times compared to NeRF-based methods, and achieves a similar rendering speed to 3DGS.
arXiv Detail & Related papers (2024-07-17T04:02:54Z) - CVT-xRF: Contrastive In-Voxel Transformer for 3D Consistent Radiance Fields from Sparse Inputs [65.80187860906115]
We propose a novel approach to improve NeRF's performance with sparse inputs.
We first adopt a voxel-based ray sampling strategy to ensure that the sampled rays intersect with a certain voxel in 3D space.
We then randomly sample additional points within the voxel and apply a Transformer to infer the properties of other points on each ray, which are then incorporated into the volume rendering.
arXiv Detail & Related papers (2024-03-25T15:56:17Z) - Leveraging Neural Radiance Field in Descriptor Synthesis for Keypoints Scene Coordinate Regression [1.2974519529978974]
This paper introduces a pipeline for keypoint descriptor synthesis using Neural Radiance Field (NeRF)
generating novel poses and feeding them into a trained NeRF model to create new views, our approach enhances the KSCR's capabilities in data-scarce environments.
The proposed system could significantly improve localization accuracy by up to 50% and cost only a fraction of time for data synthesis.
arXiv Detail & Related papers (2024-03-15T13:40:37Z) - GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting [51.96353586773191]
We introduce textbfGS-SLAM that first utilizes 3D Gaussian representation in the Simultaneous Localization and Mapping system.
Our method utilizes a real-time differentiable splatting rendering pipeline that offers significant speedup to map optimization and RGB-D rendering.
Our method achieves competitive performance compared with existing state-of-the-art real-time methods on the Replica, TUM-RGBD datasets.
arXiv Detail & Related papers (2023-11-20T12:08:23Z) - Differentiable Point-Based Radiance Fields for Efficient View Synthesis [57.56579501055479]
We propose a differentiable rendering algorithm for efficient novel view synthesis.
Our method is up to 300x faster than NeRF in both training and inference.
For dynamic scenes, our method trains two orders of magnitude faster than STNeRF and renders at near interactive rate.
arXiv Detail & Related papers (2022-05-28T04:36:13Z) - Learning Dynamic View Synthesis With Few RGBD Cameras [60.36357774688289]
We propose to utilize RGBD cameras to synthesize free-viewpoint videos of dynamic indoor scenes.
We generate point clouds from RGBD frames and then render them into free-viewpoint videos via a neural feature.
We introduce a simple Regional Depth-Inpainting module that adaptively inpaints missing depth values to render complete novel views.
arXiv Detail & Related papers (2022-04-22T03:17:35Z) - Unpaired Single-Image Depth Synthesis with cycle-consistent Wasserstein
GANs [1.0499611180329802]
Real-time estimation of actual environment depth is an essential module for various autonomous system tasks.
In this study, latest advancements in the field of generative neural networks are leveraged to fully unsupervised single-image depth synthesis.
arXiv Detail & Related papers (2021-03-31T09:43:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.