Amodal 3D Reconstruction for Robotic Manipulation via Stability and
Connectivity
- URL: http://arxiv.org/abs/2009.13146v1
- Date: Mon, 28 Sep 2020 08:52:54 GMT
- Title: Amodal 3D Reconstruction for Robotic Manipulation via Stability and
Connectivity
- Authors: William Agnew, Christopher Xie, Aaron Walsman, Octavian Murad, Caelen
Wang, Pedro Domingos, Siddhartha Srinivasa
- Abstract summary: Learning-based 3D object reconstruction enables single- or few-shot estimation of 3D object models.
Existing 3D reconstruction techniques optimize for visual reconstruction fidelity, typically measured by chamfer distance or voxel IoU.
We propose ARM, an amodal 3D reconstruction system that introduces (1) a stability prior over object shapes, (2) a connectivity prior, and (3) a multi-channel input representation.
- Score: 3.359622001455893
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning-based 3D object reconstruction enables single- or few-shot
estimation of 3D object models. For robotics, this holds the potential to allow
model-based methods to rapidly adapt to novel objects and scenes. Existing 3D
reconstruction techniques optimize for visual reconstruction fidelity,
typically measured by chamfer distance or voxel IOU. We find that when applied
to realistic, cluttered robotics environments, these systems produce
reconstructions with low physical realism, resulting in poor task performance
when used for model-based control. We propose ARM, an amodal 3D reconstruction
system that introduces (1) a stability prior over object shapes, (2) a
connectivity prior, and (3) a multi-channel input representation that allows
for reasoning over relationships between groups of objects. By using these
priors over the physical properties of objects, our system improves
reconstruction quality not just by standard visual metrics, but also
performance of model-based control on a variety of robotics manipulation tasks
in challenging, cluttered environments. Code is available at
github.com/wagnew3/ARM.
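To make the quantities above concrete, here is a minimal NumPy sketch of the two visual metrics the abstract names (symmetric chamfer distance and voxel IoU) plus a simple connectivity check of the kind a connectivity prior could build on. The function names and implementations are illustrative only and are not taken from the ARM codebase.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric chamfer distance between point sets a (N,3) and b (M,3):
    mean nearest-neighbor distance in each direction, summed."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M) pairwise
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def voxel_iou(v1, v2):
    """Intersection-over-union of two boolean voxel grids of equal shape."""
    inter = np.logical_and(v1, v2).sum()
    union = np.logical_or(v1, v2).sum()
    return inter / union if union > 0 else 1.0

def num_components(vox):
    """Count 6-connected components in a boolean voxel grid via flood fill.
    A connectivity prior could, e.g., penalize reconstructions with more
    than one component (a physically implausible, fragmented object)."""
    vox = vox.copy()
    dirs = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
            (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    count = 0
    for seed in map(tuple, np.argwhere(vox)):
        if not vox[seed]:
            continue  # already absorbed into an earlier component
        count += 1
        stack = [seed]
        vox[seed] = False
        while stack:
            x, y, z = stack.pop()
            for dx, dy, dz in dirs:
                n = (x + dx, y + dy, z + dz)
                if all(0 <= n[i] < vox.shape[i] for i in range(3)) and vox[n]:
                    vox[n] = False
                    stack.append(n)
    return count
```

The brute-force pairwise distance in `chamfer_distance` is O(NM) and only suitable for small clouds; practical implementations use a KD-tree or GPU nearest-neighbor search.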
Related papers
- Articulate-Anything: Automatic Modeling of Articulated Objects via a Vision-Language Foundation Model [35.184607650708784]
Articulate-Anything automates the articulation of diverse, complex objects from many input modalities, including text, images, and videos.
Our system exploits existing 3D asset datasets via a mesh retrieval mechanism, along with an actor-critic system that iteratively proposes, evaluates, and refines solutions.
arXiv Detail & Related papers (2024-10-03T19:42:16Z)
- Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication [50.541882834405946]
We introduce Atlas3D, an automatic and easy-to-implement text-to-3D method.
Our approach combines a novel differentiable simulation-based loss function with physically inspired regularization.
We verify Atlas3D's efficacy through extensive generation tasks and validate the resulting 3D models in both simulated and real-world environments.
arXiv Detail & Related papers (2024-05-28T18:33:18Z)
- Uncertainty-aware Active Learning of NeRF-based Object Models for Robot Manipulators using Visual and Re-orientation Actions [8.059133373836913]
This paper presents an approach that enables a robot to rapidly learn the complete 3D model of a given object for manipulation in unfamiliar orientations.
We use an ensemble of partially constructed NeRF models to quantify model uncertainty to determine the next action.
Our approach determines when and how to grasp and re-orient an object given its partial NeRF model and re-estimates the object pose to rectify misalignments introduced during the interaction.
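The "ensemble of partially constructed NeRF models" idea above can be sketched in a few lines: treat the variance across ensemble members' predictions as model uncertainty and pick the candidate view where they disagree most. This is a hedged illustration with hypothetical names; it is not the paper's actual algorithm.

```python
import numpy as np

def next_best_view(ensemble_preds):
    """ensemble_preds: (K, V, P) array of K ensemble members' occupancy
    predictions for P sample points along each of V candidate views.
    Returns the index of the view with the highest mean predictive
    variance, i.e. where the partial models disagree the most."""
    var = ensemble_preds.var(axis=0)          # (V, P) disagreement per point
    return int(var.mean(axis=1).argmax())     # most uncertain view
```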
arXiv Detail & Related papers (2024-04-02T10:15:06Z)
- SUGAR: Pre-training 3D Visual Representations for Robotics [85.55534363501131]
We introduce a novel 3D pre-training framework for robotics named SUGAR.
SUGAR captures semantic, geometric and affordance properties of objects through 3D point clouds.
We show that SUGAR's 3D representation outperforms state-of-the-art 2D and 3D representations.
arXiv Detail & Related papers (2024-04-01T21:23:03Z)
- Uncertainty-aware 3D Object-Level Mapping with Deep Shape Priors [15.34487368683311]
We propose a framework that can reconstruct high-quality object-level maps for unknown objects.
Our approach takes multiple RGB-D images as input and outputs dense 3D shapes and 9-DoF poses for detected objects.
We derive a probabilistic formulation that propagates shape and pose uncertainty through two novel loss functions.
arXiv Detail & Related papers (2023-09-17T00:48:19Z)
- Shape, Pose, and Appearance from a Single Image via Bootstrapped Radiance Field Inversion [54.151979979158085]
We introduce a principled end-to-end reconstruction framework for natural images, where accurate ground-truth poses are not available.
We leverage an unconditional 3D-aware generator, to which we apply a hybrid inversion scheme where a model produces a first guess of the solution.
Our framework can de-render an image in as few as 10 steps, enabling its use in practical scenarios.
arXiv Detail & Related papers (2022-11-21T17:42:42Z)
- Fast-Image2Point: Towards Real-Time Point Cloud Reconstruction of a Single Image using 3D Supervision [0.0]
A key question in the problem of 3D reconstruction is how to train a machine or a robot to model 3D objects.
This study addresses current problems in reconstructing objects displayed in a single-view image in a faster (real-time) fashion.
arXiv Detail & Related papers (2022-09-20T22:39:14Z)
- RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of the synthetic dataset, which consists of CAD object models, to boost the learning on real datasets.
Recent work on 3D pre-training fails when transferring features learned on synthetic objects to real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z)
- Secrets of 3D Implicit Object Shape Reconstruction in the Wild [92.5554695397653]
Reconstructing high-fidelity 3D objects from sparse, partial observation is crucial for various applications in computer vision, robotics, and graphics.
Recent neural implicit modeling methods show promising results on synthetic or dense datasets, but perform poorly on real-world data that is sparse and noisy.
This paper analyzes the root cause of such deficient performance of a popular neural implicit model.
arXiv Detail & Related papers (2021-01-18T03:24:48Z)
- PaMIR: Parametric Model-Conditioned Implicit Representation for Image-based Human Reconstruction [67.08350202974434]
We propose Parametric Model-Conditioned Implicit Representation (PaMIR), which combines the parametric body model with the free-form deep implicit function.
We show that our method achieves state-of-the-art performance for image-based 3D human reconstruction in the cases of challenging poses and clothing types.
arXiv Detail & Related papers (2020-07-08T02:26:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.