Uncertainty Guided Policy for Active Robotic 3D Reconstruction using
Neural Radiance Fields
- URL: http://arxiv.org/abs/2209.08409v1
- Date: Sat, 17 Sep 2022 21:28:57 GMT
- Title: Uncertainty Guided Policy for Active Robotic 3D Reconstruction using
Neural Radiance Fields
- Authors: Soomin Lee, Le Chen, Jiahao Wang, Alexander Liniger, Suryansh Kumar,
Fisher Yu
- Abstract summary: This paper introduces a ray-based volumetric uncertainty estimator, which computes the entropy of the weight distribution of the color samples along each ray of the object's implicit neural representation.
We show that it is possible to infer the uncertainty of the underlying 3D geometry given a novel view with the proposed estimator.
We present a next-best-view selection policy guided by the ray-based volumetric uncertainty in neural radiance fields-based representations.
- Score: 82.21033337949757
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we tackle the problem of active robotic 3D reconstruction of
an object. In particular, we study how a mobile robot with an arm-held camera
can select a favorable number of views to recover an object's 3D shape
efficiently. In contrast to existing solutions to this problem, we leverage the
popular neural radiance fields-based object representation, which has recently
shown impressive results for various computer vision tasks. However, it is not
straightforward to directly reason about an object's explicit 3D geometric
details using such a representation, making the next-best-view selection
problem for dense 3D reconstruction challenging. This paper introduces a
ray-based volumetric uncertainty estimator, which computes the entropy of the
weight distribution of the color samples along each ray of the object's
implicit neural representation. We show that it is possible to infer the
uncertainty of the underlying 3D geometry given a novel view with the proposed
estimator. We then present a next-best-view selection policy guided by the
ray-based volumetric uncertainty in neural radiance fields-based
representations. Encouraging experimental results on synthetic and real-world
data suggest that the approach presented in this paper can enable a new
research direction of using an implicit 3D object representation for the
next-best-view problem in robot vision applications, distinguishing our
approach from the existing approaches that rely on explicit 3D geometric
modeling.
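The estimator at the core of the paper is compact enough to state directly: with standard NeRF compositing weights w_i = T_i * (1 - exp(-sigma_i * delta_i)), normalized into a probability distribution along the ray, the per-ray uncertainty is the Shannon entropy H = -sum_i p_i * log p_i. Below is a minimal NumPy sketch of this computation and of a greedy view-selection step built on top of it; the render_densities hook and the pick-the-highest-mean-entropy rule are illustrative assumptions for the sketch, not the authors' exact implementation.

```python
import numpy as np

def ray_weight_entropy(sigmas, deltas, eps=1e-10):
    """Entropy of the volume-rendering weight distribution along each ray.

    sigmas, deltas: (num_rays, num_samples) arrays of predicted densities and
    distances between consecutive samples. Returns (num_rays,) entropies.
    A ray whose weight mass concentrates at one surface sample yields low
    entropy (confident geometry); diffuse weights yield high entropy.
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)              # per-sample opacity
    # Transmittance T_i = prod_{j<i} (1 - alpha_j), shifted so that T_0 = 1.
    trans = np.cumprod(1.0 - alphas + eps, axis=-1)
    trans = np.concatenate(
        [np.ones_like(trans[..., :1]), trans[..., :-1]], axis=-1)
    weights = trans * alphas                             # w_i = T_i * alpha_i
    probs = weights / (weights.sum(axis=-1, keepdims=True) + eps)
    return -(probs * np.log(probs + eps)).sum(axis=-1)  # Shannon entropy

def next_best_view(candidate_poses, render_densities):
    """Greedy NBV step: pick the candidate view with the highest mean entropy.

    render_densities(pose) -> (sigmas, deltas) is an assumed hook into a
    trained NeRF that samples densities along every ray of the candidate view.
    """
    scores = [ray_weight_entropy(*render_densities(pose)).mean()
              for pose in candidate_poses]
    return candidate_poses[int(np.argmax(scores))]
```

Normalizing the weights before taking the entropy is what makes the score comparable across rays that terminate at different depths or pass through free space.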
Related papers
- Zero123-6D: Zero-shot Novel View Synthesis for RGB Category-level 6D Pose Estimation [66.3814684757376]
This work presents Zero123-6D, the first to demonstrate the utility of Diffusion Model-based novel-view synthesizers in enhancing RGB 6D pose estimation at the category level.
The outlined method reduces data requirements, removes the need for depth information in the zero-shot category-level 6D pose estimation task, and improves performance, as demonstrated quantitatively through experiments on the CO3D dataset.
arXiv Detail & Related papers (2024-03-21T10:38:18Z) - Uncertainty-aware 3D Object-Level Mapping with Deep Shape Priors [15.34487368683311]
We propose a framework that can reconstruct high-quality object-level maps for unknown objects.
Our approach takes multiple RGB-D images as input and outputs dense 3D shapes and 9-DoF poses for detected objects.
We derive a probabilistic formulation that propagates shape and pose uncertainty through two novel loss functions.
arXiv Detail & Related papers (2023-09-17T00:48:19Z) - LIST: Learning Implicitly from Spatial Transformers for Single-View 3D
Reconstruction [5.107705550575662]
LIST is a novel neural architecture that leverages local and global image features to reconstruct the geometric and topological structure of a 3D object from a single image.
We show that our model outperforms the state of the art in reconstructing 3D objects from both synthetic and real-world images.
arXiv Detail & Related papers (2023-07-23T01:01:27Z) - 3D Neural Embedding Likelihood: Probabilistic Inverse Graphics for
Robust 6D Pose Estimation [50.15926681475939]
Inverse graphics aims to infer the 3D scene structure from 2D images.
We introduce probabilistic modeling to quantify uncertainty and achieve robustness in 6D pose estimation tasks.
3DNEL effectively combines learned neural embeddings from RGB with depth information to improve robustness in sim-to-real 6D object pose estimation from RGB-D images.
arXiv Detail & Related papers (2023-02-07T20:48:35Z) - Shape, Pose, and Appearance from a Single Image via Bootstrapped
Radiance Field Inversion [54.151979979158085]
We introduce a principled end-to-end reconstruction framework for natural images, where accurate ground-truth poses are not available.
We leverage an unconditional 3D-aware generator, to which we apply a hybrid inversion scheme where a model produces a first guess of the solution.
Our framework can de-render an image in as few as 10 steps, enabling its use in practical scenarios.
arXiv Detail & Related papers (2022-11-21T17:42:42Z) - 2D GANs Meet Unsupervised Single-view 3D Reconstruction [21.93671761497348]
Controllable image generation based on pre-trained GANs can benefit a wide range of computer vision tasks.
We propose a novel image-conditioned neural implicit field, which can leverage 2D supervisions from GAN-generated multi-view images.
The effectiveness of our approach is demonstrated through superior single-view 3D reconstruction results of generic objects.
arXiv Detail & Related papers (2022-07-20T20:24:07Z) - Seeing by haptic glance: reinforcement learning-based 3D object
Recognition [31.80213713136647]
Humans are able to perform 3D recognition through a limited number of haptic contacts between the target object and their fingers, without seeing the object.
This capability is termed a 'haptic glance' in cognitive neuroscience.
Most of the existing 3D recognition models were developed based on dense 3D data.
In many real-life use cases, where robots are used to collect 3D data by haptic exploration, only a limited number of 3D points could be collected.
A novel reinforcement-learning-based framework is proposed, in which the haptic exploration procedure is optimized jointly with the 3D recognition objective, using actively collected 3D points.
arXiv Detail & Related papers (2021-02-15T15:38:22Z) - Next-best-view Regression using a 3D Convolutional Neural Network [0.9449650062296823]
We propose a data-driven approach to address the next-best-view problem.
The proposed approach trains a 3D convolutional neural network on previous reconstructions in order to regress the position of the next-best-view (a minimal sketch of such a regressor appears after this list).
We validate the proposed approach using two groups of experiments.
arXiv Detail & Related papers (2021-01-23T01:50:26Z) - From Points to Multi-Object 3D Reconstruction [71.17445805257196]
We propose a method to detect and reconstruct multiple 3D objects from a single RGB image.
A keypoint detector localizes objects as center points and directly predicts all object properties, including 9-DoF bounding boxes and 3D shapes.
The presented approach performs lightweight reconstruction in a single stage; it is real-time capable, fully differentiable, and end-to-end trainable.
arXiv Detail & Related papers (2020-12-21T18:52:21Z) - Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
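For contrast with the entropy-guided policy above, the Next-best-view Regression entry describes a learned alternative: a 3D CNN maps a partial reconstruction to the position of the next-best-view. The sketch below shows one plausible shape of such a regressor; the 32^3 occupancy-grid input, the layer sizes, and the 3-D position output are assumptions for illustration, not details taken from that paper.

```python
import torch
import torch.nn as nn

class NBVRegressor(nn.Module):
    """Toy 3D CNN that regresses a next-best-view position from a voxel grid.

    Assumed I/O: a (B, 1, 32, 32, 32) partial occupancy grid in, a (B, 3)
    camera position out. All layer sizes are illustrative.
    """
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),   # 32 -> 16
            nn.Conv3d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
            nn.Conv3d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),  # 8 -> 4
        )
        self.head = nn.Sequential(
            nn.Flatten(),                            # 64 * 4^3 = 4096 features
            nn.Linear(64 * 4 * 4 * 4, 128), nn.ReLU(),
            nn.Linear(128, 3),                       # (x, y, z) of the next view
        )

    def forward(self, occupancy):
        return self.head(self.features(occupancy))

# Training would pair partial reconstructions with the views that most
# improved them and fit the network with a plain regression loss, e.g.
# nn.MSELoss() over (grid, position) pairs.
```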