Ternary-type Opacity and Hybrid Odometry for RGB-only NeRF-SLAM
- URL: http://arxiv.org/abs/2312.13332v2
- Date: Fri, 22 Dec 2023 18:32:55 GMT
- Title: Ternary-type Opacity and Hybrid Odometry for RGB-only NeRF-SLAM
- Authors: Junru Lin, Asen Nachkov, Songyou Peng, Luc Van Gool, Danda Pani Paudel
- Abstract summary: We study why ternary-type opacity is well-suited and desired for the task at hand.
We propose a simple yet novel visual odometry scheme that uses a hybrid combination of volumetric and warping-based image renderings.
- Score: 62.23809541385653
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The opacity of rigid 3D scenes with opaque surfaces is considered to be of a
binary type. However, we observed that this property is not followed by the
existing RGB-only NeRF-SLAM. Therefore, we are motivated to introduce this
prior into the RGB-only NeRF-SLAM pipeline. Unfortunately, the optimization
through the volumetric rendering function does not facilitate easy integration
of the desired prior. Instead, we observed that the opacity of ternary-type
(TT) is well supported. In this work, we study why ternary-type opacity is
well-suited and desired for the task at hand. In particular, we provide
theoretical insights into the process of jointly optimizing radiance and
opacity through the volumetric rendering process. Through exhaustive
experiments on benchmark datasets, we validate our claim and provide insights
into the optimization process, which we believe will unleash the potential of
RGB-only NeRF-SLAM. To foster this line of research, we also propose a simple
yet novel visual odometry scheme that uses a hybrid combination of volumetric
and warping-based image renderings. More specifically, the proposed hybrid
odometry (HO) additionally uses image warping-based coarse odometry, leading to
a final speed-up of up to an order of magnitude. Furthermore, we show that the proposed
TT and HO well complement each other, offering state-of-the-art results on
benchmark datasets in terms of both speed and accuracy.
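
The abstract refers to optimizing opacity through the volumetric rendering function. As a rough illustration only, the sketch below uses the standard NeRF-style alpha compositing rule, not the authors' implementation. The function names and the sample opacity values are hypothetical, and the intermediate value 0.5 is an assumption chosen for illustration; the abstract does not state which values ternary-type opacity takes.

```python
def composite_weights(alphas):
    """Standard alpha-compositing weights along one ray.

    w_i = T_i * alpha_i, where T_i = prod_{j < i} (1 - alpha_j)
    is the transmittance accumulated before sample i.
    """
    weights = []
    transmittance = 1.0
    for alpha in alphas:
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights


def render_value(alphas, values):
    """Composite per-sample values (e.g. one color channel) into one pixel value."""
    return sum(w * v for w, v in zip(composite_weights(alphas), values))


# Hypothetical per-sample opacities along a single ray.
binary_ray = [0.0, 0.0, 1.0, 1.0]    # empty space, then a fully opaque surface
ternary_ray = [0.0, 0.5, 1.0, 0.3]   # one intermediate sample in front of the surface

print(composite_weights(binary_ray))   # [0.0, 0.0, 1.0, 0.0]
print(composite_weights(ternary_ray))  # [0.0, 0.5, 0.5, 0.0]
```

With binary opacities, all the weight falls on the first opaque sample. With an intermediate opacity in front of the surface, the weight is shared between two samples, and any sample behind a fully opaque one contributes nothing.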
Related papers
- UniRGB-IR: A Unified Framework for Visible-Infrared Downstream Tasks via Adapter Tuning [17.22733823085519]
We propose a scalable and efficient framework called UniRGB-IR to unify RGB-IR downstream tasks.
Our framework consists of a vision transformer (ViT) foundation model, a Multi-modal Feature Pool (MFP) module and a Supplementary Feature Injector (SFI) module.
Experimental results on various RGB-IR downstream tasks demonstrate that our method can achieve state-of-the-art performance.
arXiv Detail & Related papers (2024-04-26T12:21:57Z) - GlORIE-SLAM: Globally Optimized RGB-only Implicit Encoding Point Cloud SLAM [53.6402869027093]
We propose an efficient RGB-only dense SLAM system using a flexible neural point cloud scene representation.
We also introduce a novel DSPO layer for bundle adjustment which optimizes the pose and depth of keyframes along with the scale of the monocular depth.
arXiv Detail & Related papers (2024-03-28T16:32:06Z) - Benchmarking Implicit Neural Representation and Geometric Rendering in Real-Time RGB-D SLAM [6.242958695705305]
Implicit neural representation (INR) in combination with geometric rendering has been employed in real-time dense RGB-D SLAM.
We establish the first open-source benchmark framework to evaluate the performance of a wide spectrum of commonly used INRs and rendering functions.
We propose explicit hybrid encoding for high-fidelity dense grid mapping to comply with the RGB-D SLAM system.
arXiv Detail & Related papers (2024-03-28T14:59:56Z) - CVT-xRF: Contrastive In-Voxel Transformer for 3D Consistent Radiance Fields from Sparse Inputs [65.80187860906115]
We propose a novel approach to improve NeRF's performance with sparse inputs.
We first adopt a voxel-based ray sampling strategy to ensure that the sampled rays intersect with a certain voxel in 3D space.
We then randomly sample additional points within the voxel and apply a Transformer to infer the properties of other points on each ray, which are then incorporated into the volume rendering.
arXiv Detail & Related papers (2024-03-25T15:56:17Z) - Attentive Multimodal Fusion for Optical and Scene Flow [24.08052492109655]
Existing methods typically rely solely on RGB images or fuse the modalities at later stages.
We propose a novel deep neural network approach named FusionRAFT, which enables early-stage information fusion between sensor modalities.
Our approach exhibits improved robustness in the presence of noise and low-lighting conditions that affect the RGB images.
arXiv Detail & Related papers (2023-07-28T04:36:07Z) - A Combined Approach Toward Consistent Reconstructions of Indoor Spaces Based on 6D RGB-D Odometry and KinectFusion [7.503338065129185]
We propose a 6D RGB-D odometry approach that finds the relative camera pose between consecutive RGB-D frames by keypoint extraction.
We feed the estimated pose to the highly accurate KinectFusion algorithm, which fine-tunes the frame-to-frame relative pose.
Our algorithm outputs a ready-to-use polygon mesh (highly suitable for creating 3D virtual worlds) without any postprocessing steps.
arXiv Detail & Related papers (2022-12-25T22:52:25Z) - NeRF in detail: Learning to sample for view synthesis [104.75126790300735]
Neural radiance fields (NeRF) methods have demonstrated impressive novel view synthesis.
In this work we address a clear limitation of the vanilla coarse-to-fine approach -- that it is based on a heuristic and not trained end-to-end for the task at hand.
We introduce a differentiable module that learns to propose samples and their importance for the fine network, and consider and compare multiple alternatives for its neural architecture.
arXiv Detail & Related papers (2021-06-09T17:59:10Z) - Neural BRDF Representation and Importance Sampling [79.84316447473873]
We present a compact neural network-based representation of reflectance BRDF data.
We encode BRDFs as lightweight networks, and propose a training scheme with adaptive angular sampling.
We evaluate encoding results on isotropic and anisotropic BRDFs from multiple real-world datasets.
arXiv Detail & Related papers (2021-02-11T12:00:24Z) - Fast Hyperspectral Image Recovery via Non-iterative Fusion of Dual-Camera Compressive Hyperspectral Imaging [22.683482662362337]
Coded aperture snapshot spectral imaging (CASSI) is a promising technique to capture the three-dimensional hyperspectral image (HSI).
Various regularizers have been exploited to reconstruct the 3D data from the 2D measurement.
One feasible solution is to utilize additional information such as the RGB measurement in CASSI.
arXiv Detail & Related papers (2020-12-30T10:29:32Z) - Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGBD images for providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels and model the problem as a cross-modal feature fusion.
In this paper, we propose a unified and efficient Cross-modality Guided Encoder to not only effectively recalibrate RGB feature responses, but also to distill accurate depth information via multiple stages and aggregate the two recalibrated representations alternately.
arXiv Detail & Related papers (2020-07-17T18:35:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.