Next-best-view Regression using a 3D Convolutional Neural Network
- URL: http://arxiv.org/abs/2101.09397v1
- Date: Sat, 23 Jan 2021 01:50:26 GMT
- Title: Next-best-view Regression using a 3D Convolutional Neural Network
- Authors: J. Irving Vasquez-Gomez and David Troncoso and Israel Becerra and
Enrique Sucar and Rafael Murrieta-Cid
- Abstract summary: We propose a data-driven approach to address the next-best-view problem.
The proposed approach trains a 3D convolutional neural network with previous reconstructions in order to regress the position of the next-best-view.
We have validated the proposed approach making use of two groups of experiments.
- Score: 0.9449650062296823
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automated three-dimensional (3D) object reconstruction is the task of
building a geometric representation of a physical object by means of sensing
its surface. Even though new single-view reconstruction techniques can predict
the surface, they lead to incomplete models, especially for uncommon objects
such as antiques or art sculptures. Therefore, to achieve the task's
goals, it is essential to automatically determine the locations where the
sensor will be placed so that the surface will be completely observed. This
problem is known as the next-best-view problem. In this paper, we propose a
data-driven approach to address the problem. The proposed approach trains a 3D
convolutional neural network (3D CNN) with previous reconstructions in order to
regress the position of the next-best-view. To the best of our
knowledge, this is one of the first works that directly infers the
next-best-view in a continuous space using a data-driven approach for the 3D
object reconstruction task. We validated the proposed approach with two groups
of experiments. In the first group, several variants of the
proposed architecture are analyzed. Predicted next-best-views were observed to
be closely positioned to the ground truth. In the second group of experiments,
the proposed approach is requested to reconstruct several unseen objects,
namely, objects not considered by the 3D CNN during training nor validation.
Coverage percentages of up to 90% were observed. Compared with current
state-of-the-art methods, the proposed approach improves on previous
next-best-view classification approaches and is fast at run time (3 frames per
second), since it does not compute the expensive ray tracing required by
previous information metrics.
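The abstract gives the idea but not the network details; the following is a minimal sketch of how such a regressor could look, with every layer size and name an assumption rather than the paper's reported architecture:

```python
# Minimal sketch (assumption: PyTorch; layer sizes and names are illustrative,
# not the paper's reported architecture). A 3D CNN consumes a voxel grid of
# the current partial reconstruction and regresses a 3D sensor position.
import torch
import torch.nn as nn

class NBVRegressor(nn.Module):
    def __init__(self, grid_size=32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
        )
        flat = 32 * (grid_size // 4) ** 3
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(flat, 256), nn.ReLU(),
            nn.Linear(256, 3),  # (x, y, z) of the next-best-view
        )

    def forward(self, voxels):
        # voxels: (batch, 1, D, H, W) occupancy grid of the partial model
        return self.head(self.features(voxels))

model = NBVRegressor(grid_size=32)
grid = torch.rand(1, 1, 32, 32, 32)  # stand-in for a partial reconstruction
nbv_position = model(grid)           # tensor of shape (1, 3)
```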
Related papers
- Uncertainty Guided Policy for Active Robotic 3D Reconstruction using
Neural Radiance Fields [82.21033337949757]
This paper introduces a ray-based volumetric uncertainty estimator, which computes the entropy of the weight distribution of the color samples along each ray of the object's implicit neural representation.
We show that it is possible to infer the uncertainty of the underlying 3D geometry given a novel view with the proposed estimator.
We present a next-best-view selection policy guided by the ray-based volumetric uncertainty in neural radiance fields-based representations.
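As a rough illustration of the estimator described above, the sketch below computes the entropy of the volume-rendering weight distribution along one ray; the inputs and shapes are assumptions, not the paper's exact formulation:

```python
# Rough illustration (assumed inputs; not the paper's exact formulation):
# entropy of the volume-rendering weight distribution along a single ray.
import torch

def ray_weight_entropy(densities, deltas, eps=1e-10):
    """densities: (N,) volume densities at N samples along the ray.
    deltas: (N,) distances between consecutive samples."""
    alpha = 1.0 - torch.exp(-densities * deltas)     # per-sample opacity
    trans = torch.cumprod(1.0 - alpha + eps, dim=0)  # transmittance after each sample
    trans = torch.cat([torch.ones(1), trans[:-1]])   # T_i: transmittance before sample i
    weights = alpha * trans                          # standard rendering weights
    p = weights / (weights.sum() + eps)              # normalize to a distribution
    return -(p * torch.log(p + eps)).sum()           # Shannon entropy

uncertainty = ray_weight_entropy(torch.rand(64), torch.full((64,), 0.05))
```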
arXiv Detail & Related papers (2022-09-17T21:28:57Z)
- Single-view 3D Mesh Reconstruction for Seen and Unseen Categories [69.29406107513621]
Single-view 3D Mesh Reconstruction is a fundamental computer vision task that aims at recovering 3D shapes from single-view RGB images.
This paper tackles single-view 3D mesh reconstruction to study model generalization to unseen categories.
We propose an end-to-end two-stage network, GenMesh, to break the category boundaries in reconstruction.
arXiv Detail & Related papers (2022-08-04T14:13:35Z)
- NeurAR: Neural Uncertainty for Autonomous 3D Reconstruction [64.36535692191343]
Implicit neural representations have shown compelling results in offline 3D reconstruction and also recently demonstrated the potential for online SLAM systems.
This paper addresses two key challenges: 1) seeking a criterion to measure the quality of the candidate viewpoints for the view planning based on the new representations, and 2) learning the criterion from data that can generalize to different scenes instead of hand-crafting one.
Our method demonstrates significant improvements on various metrics for the rendered image quality and the geometry quality of the reconstructed 3D models when compared with variants using TSDF or reconstruction without view planning.
arXiv Detail & Related papers (2022-07-22T10:05:36Z)
- Reconstruct from Top View: A 3D Lane Detection Approach based on Geometry Structure Prior [19.1954119672487]
We propose an advanced approach to monocular 3D lane detection that leverages the geometry structure underlying the 2D-to-3D lane reconstruction process.
We first analyze the geometry between the 3D lane and its 2D representation on the ground and propose to impose explicit supervision based on the structure prior.
Second, to reduce the structure loss in the 2D lane representation, we directly extract top-view lane information from front-view images.
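One standard way to obtain such a top view (an assumption here, since the summary does not spell out the projection) is a flat-ground homography warp:

```python
# Illustrative sketch (the homography corners below are made up): under a
# flat-ground assumption, a perspective warp maps a front-view image to a
# top (bird's-eye) view from which lane structure can be read off.
import cv2
import numpy as np

front = np.zeros((720, 1280, 3), np.uint8)  # stand-in for a front-view frame
# Four ground-plane points in the front view and their top-view targets.
src = np.float32([[560, 440], [720, 440], [1180, 700], [100, 700]])
dst = np.float32([[200, 0], [440, 0], [440, 640], [200, 640]])
H = cv2.getPerspectiveTransform(src, dst)        # 3x3 ground homography
top_view = cv2.warpPerspective(front, H, (640, 640))
```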
arXiv Detail & Related papers (2022-06-21T04:03:03Z)
- Learnable Triangulation for Deep Learning-based 3D Reconstruction of Objects of Arbitrary Topology from Single RGB Images [12.693545159861857]
We propose a novel deep reinforcement learning-based approach for 3D object reconstruction from monocular images.
The proposed method outperforms the state-of-the-art in terms of visual quality, reconstruction accuracy, and computational time.
arXiv Detail & Related papers (2021-09-24T09:44:22Z)
- RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of the synthetic dataset, which consists of CAD object models, to boost the learning on real datasets.
Recent work on 3D pre-training fails when transferring features learned on synthetic objects to real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z)
- H3D-Net: Few-Shot High-Fidelity 3D Head Reconstruction [27.66008315400462]
Recent learning approaches that implicitly represent surface geometry have shown impressive results in the problem of multi-view 3D reconstruction.
We tackle these limitations for the specific problem of few-shot full 3D head reconstruction.
We learn a shape model of 3D heads from thousands of incomplete raw scans using implicit representations.
arXiv Detail & Related papers (2021-07-26T23:04:18Z)
- Soft Expectation and Deep Maximization for Image Feature Detection [68.8204255655161]
We propose SEDM, an iterative semi-supervised learning process that flips the question and first looks for repeatable 3D points, then trains a detector to localize them in image space.
Our results show that this new model trained using SEDM is able to better localize the underlying 3D points in a scene.
arXiv Detail & Related papers (2021-04-21T00:35:32Z)
- From Points to Multi-Object 3D Reconstruction [71.17445805257196]
We propose a method to detect and reconstruct multiple 3D objects from a single RGB image.
A keypoint detector localizes objects as center points and directly predicts all object properties, including 9-DoF bounding boxes and 3D shapes.
The presented approach performs lightweight reconstruction in a single stage; it is real-time capable, fully differentiable, and end-to-end trainable.
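For intuition, center-point localization is commonly decoded as peak extraction from a heatmap; the sketch below is an illustrative assumption, not necessarily this paper's decoder:

```python
# Assumption-laden illustration (CenterNet-style decoding, not necessarily
# the paper's exact decoder): extract object centers as local peaks of a
# predicted center heatmap.
import torch
import torch.nn.functional as F

def decode_centers(heatmap, k=10):
    """heatmap: (H, W) per-pixel center scores in [0, 1]."""
    # A 3x3 max-pool acts as cheap non-maximum suppression: keep local peaks.
    pooled = F.max_pool2d(heatmap[None, None], 3, stride=1, padding=1)[0, 0]
    peaks = heatmap * (pooled == heatmap).float()
    scores, idx = peaks.flatten().topk(k)
    ys = torch.div(idx, heatmap.shape[1], rounding_mode="floor")
    xs = idx % heatmap.shape[1]
    return torch.stack([xs, ys], dim=1), scores  # (k, 2) centers and scores

centers, scores = decode_centers(torch.rand(128, 128))
```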
arXiv Detail & Related papers (2020-12-21T18:52:21Z)
- Monocular 3D Detection with Geometric Constraints Embedding and Semi-supervised Training [3.8073142980733]
We propose a novel framework for monocular 3D object detection using only RGB images, called KM3D-Net.
We design a fully convolutional model to predict object keypoints, dimensions, and orientation, and then combine these estimates with perspective geometry constraints to compute the position attribute.
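As an illustration of combining keypoint estimates with perspective geometry, the sketch below recovers a box translation by linear least squares from projected corner keypoints; the setup and names are assumptions, not KM3D-Net's exact formulation:

```python
# Illustrative sketch (variable names and setup are assumptions, not
# KM3D-Net's exact formulation): given rotated object-frame box corners
# (from predicted dimensions and orientation), camera intrinsics K, and
# the 2D corner keypoints, recover the translation by least squares.
import numpy as np

def solve_translation(corners_2d, corners_obj, K):
    """corners_2d: (8, 2) pixel keypoints; corners_obj: (8, 3) rotated
    box corners before translation; K: (3, 3) camera intrinsics."""
    rays = np.linalg.inv(K) @ np.column_stack([corners_2d, np.ones(8)]).T
    A, b = [], []
    for (nx, ny, _), (X, Y, Z) in zip(rays.T, corners_obj):
        # Projection constraints: X + Tx = nx*(Z + Tz), Y + Ty = ny*(Z + Tz)
        A.append([-1.0, 0.0, nx]); b.append(X - nx * Z)
        A.append([0.0, -1.0, ny]); b.append(Y - ny * Z)
    T, *_ = np.linalg.lstsq(np.asarray(A), np.asarray(b), rcond=None)
    return T  # camera-frame translation of the box center
```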
arXiv Detail & Related papers (2020-09-02T00:51:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.