BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown
Objects
- URL: http://arxiv.org/abs/2303.14158v1
- Date: Fri, 24 Mar 2023 17:13:49 GMT
- Title: BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown
Objects
- Authors: Bowen Wen, Jonathan Tremblay, Valts Blukis, Stephen Tyree, Thomas
Müller, Alex Evans, Dieter Fox, Jan Kautz, Stan Birchfield
- Abstract summary: We present a near real-time method for 6-DoF tracking of an unknown object from a monocular RGBD video sequence.
Our method works for arbitrary rigid objects, even when visual texture is largely absent.
- Score: 89.2314092102403
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a near real-time method for 6-DoF tracking of an unknown object
from a monocular RGBD video sequence, while simultaneously performing neural 3D
reconstruction of the object. Our method works for arbitrary rigid objects,
even when visual texture is largely absent. The object is assumed to be
segmented in the first frame only. No additional information is required, and
no assumption is made about the interaction agent. Key to our method is a
Neural Object Field that is learned concurrently with a pose graph optimization
process in order to robustly accumulate information into a consistent 3D
representation capturing both geometry and appearance. A dynamic pool of posed
memory frames is automatically maintained to facilitate communication between
these threads. Our approach handles challenging sequences with large pose
changes, partial and full occlusion, untextured surfaces, and specular
highlights. We show results on HO3D, YCBInEOAT, and BEHAVE datasets,
demonstrating that our method significantly outperforms existing approaches.
Project page: https://bundlesdf.github.io
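The core of the abstract is the interplay between the online pose tracker and the concurrently trained Neural Object Field, coupled through a dynamic pool of posed memory frames. The following is a minimal sketch, not the authors' implementation, of how two such threads might exchange information; all names (MemoryPool, track_frame, train_object_field) are hypothetical and the pose and field updates are stubbed out.

```python
# Hypothetical sketch of the two concurrent processes described in the abstract:
# an online pose tracker and a Neural Object Field trainer communicating
# through a shared pool of posed memory frames.
import threading
import numpy as np

class MemoryPool:
    """Dynamic pool of keyframes together with their current pose estimates."""
    def __init__(self):
        self._lock = threading.Lock()
        self.frames = []                      # list of [rgbd_frame, 4x4 pose]

    def add(self, frame, pose):
        with self._lock:
            self.frames.append([frame, pose])

    def snapshot(self):
        with self._lock:
            return [(f, p.copy()) for f, p in self.frames]

    def update_poses(self, refined):
        # Poses refined by the neural-field thread flow back to the tracker.
        with self._lock:
            for i, pose in refined.items():
                if i < len(self.frames):
                    self.frames[i][1] = pose

def track_frame(frame, pool):
    # Stub: the real tracker matches the new frame against memory frames and
    # runs an online pose-graph optimization; here we return identity.
    return np.eye(4)

def train_object_field(pool, stop_event):
    # Stub: the real worker fits geometry and appearance to the posed memory
    # frames and returns per-frame pose corrections.
    while not stop_event.is_set():
        frames = pool.snapshot()
        if frames:
            pool.update_poses({i: pose for i, (_, pose) in enumerate(frames)})
        stop_event.wait(0.05)

def run(video_frames):
    pool, stop = MemoryPool(), threading.Event()
    worker = threading.Thread(target=train_object_field, args=(pool, stop))
    worker.start()
    for frame in video_frames:                # online, frame-by-frame tracking
        pose = track_frame(frame, pool)
        pool.add(frame, pose)                 # keyframe selection omitted
    stop.set()
    worker.join()
    return pool.snapshot()

if __name__ == "__main__":
    run([np.zeros((4, 4, 4)) for _ in range(10)])
```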
Related papers
- Inverse Neural Rendering for Explainable Multi-Object Tracking [35.072142773300655]
We recast 3D multi-object tracking from RGB cameras as an Inverse Rendering (IR) problem.
We optimize an image loss over generative latent spaces that inherently disentangle shape and appearance properties.
We validate the generalization and scaling capabilities of our method by learning the generative prior exclusively from synthetic data.
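As a rough illustration of this analysis-by-synthesis formulation, the toy sketch below runs gradient descent on disentangled shape and appearance latents to minimize an image loss against an observed target. The toy_renderer and the latent parameterization are invented for the example; the actual method optimizes over a learned generative prior with a proper differentiable renderer.

```python
# Toy illustration (not the paper's code) of optimizing shape/appearance latents
# so that a differentiable render matches an observed image.
import torch

def toy_renderer(z_shape, z_appearance):
    # Stand-in for a generative object model + differentiable renderer:
    # produce an 8x8 RGB "image" from the two latent codes.
    grid = torch.linspace(-1.0, 1.0, 8)
    yy, xx = torch.meshgrid(grid, grid, indexing="ij")
    radius = torch.sigmoid(z_shape[0])             # shape latent controls extent
    mask = torch.sigmoid(10.0 * (radius - (xx**2 + yy**2).sqrt()))
    color = torch.sigmoid(z_appearance)            # appearance latent controls color
    return mask.unsqueeze(-1) * color              # (8, 8, 3)

def fit_latents(observed, steps=200, lr=0.05):
    z_shape = torch.zeros(1, requires_grad=True)
    z_app = torch.zeros(3, requires_grad=True)
    opt = torch.optim.Adam([z_shape, z_app], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.mean((toy_renderer(z_shape, z_app) - observed) ** 2)
        loss.backward()
        opt.step()
    return z_shape.detach(), z_app.detach()

if __name__ == "__main__":
    target = toy_renderer(torch.tensor([0.5]), torch.tensor([2.0, -2.0, 0.0]))
    print(fit_latents(target))
```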
arXiv Detail & Related papers (2024-04-18T17:37:53Z)
- Tracking by 3D Model Estimation of Unknown Objects in Videos [122.56499878291916]
We argue that this representation is limited and instead propose to guide and improve 2D tracking with an explicit object representation.
Our representation tackles a complex long-term dense correspondence problem between all 3D points on the object for all video frames.
The proposed optimization minimizes a novel loss function to estimate the best 3D shape, texture, and 6DoF pose.
arXiv Detail & Related papers (2023-04-13T11:32:36Z)
- Visibility Aware Human-Object Interaction Tracking from Single RGB Camera [40.817960406002506]
We propose a novel method to track the 3D human, object, contacts between them, and their relative translation across frames from a single RGB camera.
We condition our neural field reconstructions for human and object on per-frame SMPL model estimates obtained by pre-fitting SMPL to a video sequence.
Human and object motion from visible frames provides valuable information to infer the occluded object.
arXiv Detail & Related papers (2023-03-29T06:23:44Z)
- 3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data [24.97027425606138]
Reconstructing the underlying 3D surface of an object from a single image is a challenging problem.
We present a new method for joint category-specific 3D reconstruction and object pose estimation from a single image.
Our approach achieves state-of-the-art reconstruction performance across several real-world datasets.
arXiv Detail & Related papers (2023-02-24T20:37:27Z)
- PIZZA: A Powerful Image-only Zero-Shot Zero-CAD Approach to 6 DoF Tracking [27.283648727847268]
We present a method for tracking the 6D motion of objects in RGB video sequences when neither the training images nor the 3D geometry of the objects are available.
In contrast to previous works, our method can therefore handle unknown objects in the open world immediately, without any per-object preparation.
Our results on challenging datasets are on par with previous works that require much more information.
arXiv Detail & Related papers (2022-09-15T19:55:13Z)
- Neural Correspondence Field for Object Pose Estimation [67.96767010122633]
We propose a method for estimating the 6DoF pose of a rigid object with an available 3D model from a single RGB image.
Unlike classical correspondence-based methods which predict 3D object coordinates at pixels of the input image, the proposed method predicts 3D object coordinates at 3D query points sampled in the camera frustum.
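A hedged sketch of this query-point formulation: object-frame coordinates are predicted for 3D points sampled in the camera frustum, and a rigid pose is recovered by aligning the two point sets with the Kabsch algorithm. The learned network is replaced here by a fixed transform (predict_object_coords) so the example stays self-contained, and the paper's actual pose-fitting scheme may differ.

```python
# Sketch: predict an object-frame coordinate for 3D points sampled in the
# camera frustum, then recover a rigid pose by point-set alignment (Kabsch).
import numpy as np

def sample_frustum_points(n, near=0.2, far=1.5, fov=0.8):
    depth = np.random.uniform(near, far, n)
    x = np.random.uniform(-1, 1, n) * depth * np.tan(fov / 2)
    y = np.random.uniform(-1, 1, n) * depth * np.tan(fov / 2)
    return np.stack([x, y, depth], axis=1)          # points in the camera frame

def predict_object_coords(points_cam):
    # Placeholder for the learned network: apply a fixed rigid transform so
    # the example has a recoverable ground-truth pose.
    angle = 0.3
    R = np.array([[np.cos(angle), -np.sin(angle), 0],
                  [np.sin(angle),  np.cos(angle), 0],
                  [0, 0, 1]])
    t = np.array([0.1, -0.05, 0.4])
    return (points_cam - t) @ R                     # object-frame coordinates

def kabsch(src, dst):
    # Rigid transform (R, t) mapping src points onto dst points.
    src_c, dst_c = src - src.mean(0), dst - dst.mean(0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T
    t = dst.mean(0) - R @ src.mean(0)
    return R, t

if __name__ == "__main__":
    pts_cam = sample_frustum_points(500)
    pts_obj = predict_object_coords(pts_cam)
    R, t = kabsch(pts_obj, pts_cam)                 # object-to-camera pose
    print(np.round(R, 3), np.round(t, 3))
```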
arXiv Detail & Related papers (2022-07-30T01:48:23Z)
- ShAPO: Implicit Representations for Multi-Object Shape, Appearance, and Pose Optimization [40.36229450208817]
We present ShAPO, a method for joint multi-object detection, 3D textured reconstruction, 6D object pose and size estimation.
Key to ShAPO is a single-shot pipeline to regress shape, appearance and pose latent codes along with the masks of each object instance.
Our method significantly outperforms all baselines on the NOCS dataset with an 8% absolute improvement in mAP for 6D pose estimation.
arXiv Detail & Related papers (2022-07-27T17:59:31Z)
- Point2Seq: Detecting 3D Objects as Sequences [58.63662049729309]
We present a simple and effective framework, named Point2Seq, for 3D object detection from point clouds.
We view each 3D object as a sequence of words and reformulate the 3D object detection task as decoding words from 3D scenes in an auto-regressive manner.
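To make the "sequence of words" framing concrete, here is a toy, hedged sketch in which the attributes of a single box are decoded one at a time, each step conditioned on a scene feature and the previously decoded values. The attribute list and the linear decode_step are invented for illustration; the actual model decodes learned tokens with attention over point-cloud features.

```python
# Toy sketch of decoding one 3D box as a short "sentence" of attributes,
# each step conditioned on the scene feature and the values decoded so far.
import numpy as np

ATTRIBUTES = ["x", "y", "z", "width", "length", "height", "heading", "class"]

def decode_step(scene_feature, previous, step, weights):
    # Stand-in for a learned layer: one linear map per attribute over the
    # concatenation of the scene feature and the previously decoded values.
    context = np.concatenate([scene_feature, previous])
    return float(context @ weights[step][: context.size])

def decode_box(scene_feature, weights):
    previous, box = np.zeros(0), {}
    for step, name in enumerate(ATTRIBUTES):
        value = decode_step(scene_feature, previous, step, weights)
        box[name] = value
        previous = np.append(previous, value)   # feed back for the next step
    return box

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    scene_feature = rng.normal(size=16)
    weights = [rng.normal(size=16 + len(ATTRIBUTES)) for _ in ATTRIBUTES]
    print(decode_box(scene_feature, weights))
```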
arXiv Detail & Related papers (2022-03-25T00:20:31Z)
- RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of the synthetic dataset, which consists of CAD object models, to boost the learning on real datasets.
Recent work on 3D pre-training fails when transferring features learned on synthetic objects to other real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z)
- Neural Articulated Radiance Field [90.91714894044253]
We present Neural Articulated Radiance Field (NARF), a novel deformable 3D representation for articulated objects learned from images.
Experiments show that the proposed method is efficient and can generalize well to novel poses.
arXiv Detail & Related papers (2021-04-07T13:23:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of the information presented and is not responsible for any consequences arising from its use.