CabiNet: Scaling Neural Collision Detection for Object Rearrangement
with Procedural Scene Generation
- URL: http://arxiv.org/abs/2304.09302v1
- Date: Tue, 18 Apr 2023 21:09:55 GMT
- Title: CabiNet: Scaling Neural Collision Detection for Object Rearrangement
with Procedural Scene Generation
- Authors: Adithyavairavan Murali, Arsalan Mousavian, Clemens Eppner, Adam
Fishman, Dieter Fox
- Abstract summary: We first generate over 650K cluttered scenes - orders of magnitude more than prior work - in diverse everyday environments.
We render synthetic partial point clouds from this data and use it to train our CabiNet model architecture.
CabiNet is a collision model that accepts object and scene point clouds, captured from a single-view depth observation.
- Score: 54.68738348071891
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We address the important problem of generalizing robotic rearrangement to
clutter without any explicit object models. We first generate over 650K
cluttered scenes - orders of magnitude more than prior work - in diverse
everyday environments, such as cabinets and shelves. We render synthetic
partial point clouds from this data and use it to train our CabiNet model
architecture. CabiNet is a collision model that accepts object and scene point
clouds, captured from a single-view depth observation, and predicts collisions
for SE(3) object poses in the scene. Our representation has a fast inference
speed of 7 microseconds per query with nearly 20% higher performance than
baseline approaches in challenging environments. We use this collision model in
conjunction with a Model Predictive Path Integral (MPPI) planner to generate
collision-free trajectories for picking and placing in clutter. CabiNet also
predicts waypoints, computed from the scene's signed distance field (SDF), that
allow the robot to navigate tight spaces during rearrangement. This improves
rearrangement performance by nearly 35% compared to baselines. We
systematically evaluate our approach, procedurally generate simulated
experiments, and demonstrate that our approach directly transfers to the real
world, despite training exclusively in simulation. Robot experiment demos with
completely unknown scenes and objects can be found at
https://cabinet-object-rearrangement.github.io
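The query interface described above - scene and object point clouds in, collision predictions for a batch of SE(3) object poses out - can be sketched as follows. This is a minimal illustration, not the CabiNet network: a brute-force nearest-neighbor distance check stands in for the learned model, and the function name, clearance threshold, and toy data are all assumptions.

```python
import numpy as np

def check_collisions(scene_pts, object_pts, poses, clearance=0.01):
    """Predict collisions for a batch of SE(3) object poses.

    scene_pts : (N, 3) scene point cloud from a single depth view.
    object_pts: (M, 3) object point cloud in its local frame.
    poses     : (B, 4, 4) homogeneous SE(3) transforms to test.
    Returns a (B,) boolean array, True meaning predicted collision.
    (A learned model such as CabiNet replaces the distance test below.)
    """
    results = np.empty(len(poses), dtype=bool)
    for i, T in enumerate(poses):
        # Transform object points into the scene frame.
        pts = object_pts @ T[:3, :3].T + T[:3, 3]
        # Stand-in for the network: collide if any transformed object
        # point comes within `clearance` of a scene point.
        d2 = ((pts[:, None, :] - scene_pts[None, :, :]) ** 2).sum(-1)
        results[i] = bool((d2 < clearance ** 2).any())
    return results

# Toy scene: a flat patch of points at z = 0 (e.g. a shelf surface).
xy = np.stack(np.meshgrid(np.linspace(-1, 1, 21),
                          np.linspace(-1, 1, 21)), -1).reshape(-1, 2)
scene = np.concatenate([xy, np.zeros((len(xy), 1))], axis=1)
# Toy object: a 3x3x3 lattice of points on a 10 cm cube.
cube = np.array([[x, y, z] for x in (-0.05, 0.0, 0.05)
                           for y in (-0.05, 0.0, 0.05)
                           for z in (-0.05, 0.0, 0.05)])

hover = np.eye(4); hover[2, 3] = 0.5   # held well above the patch
touch = np.eye(4)                      # intersecting the patch
print(check_collisions(scene, cube, np.stack([hover, touch])))
# -> [False  True]
```

At 7 microseconds per query the real model is fast enough to score thousands of candidate placements inside a sampling-based planner; the brute-force distance check above scales as O(N*M) and serves only to make the interface concrete.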
Related papers
- PickScan: Object discovery and reconstruction from handheld interactions [99.99566882133179]
We develop an interaction-guided and class-agnostic method to reconstruct 3D representations of scenes.
Our main contribution is a novel approach to detecting user-object interactions and extracting the masks of manipulated objects.
Compared to Co-Fusion, the only comparable interaction-based and class-agnostic baseline, this corresponds to a reduction in chamfer distance of 73%.
arXiv Detail & Related papers (2024-11-17T23:09:08Z)
- Uncertainty-aware Active Learning of NeRF-based Object Models for Robot Manipulators using Visual and Re-orientation Actions [8.059133373836913]
This paper presents an approach that enables a robot to rapidly learn the complete 3D model of a given object for manipulation in unfamiliar orientations.
We use an ensemble of partially constructed NeRF models to quantify model uncertainty to determine the next action.
Our approach determines when and how to grasp and re-orient an object given its partial NeRF model and re-estimates the object pose to rectify misalignments introduced during the interaction.
arXiv Detail & Related papers (2024-04-02T10:15:06Z)
- Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion [110.84357383258818]
We propose a novel approach to lift 2D segments to 3D and fuse them by means of a neural field representation.
The core of our approach is a slow-fast clustering objective function, which is scalable and well-suited for scenes with a large number of objects.
Our approach outperforms the state-of-the-art on challenging scenes from the ScanNet, Hypersim, and Replica datasets.
arXiv Detail & Related papers (2023-06-07T17:57:45Z)
- COPILOT: Human-Environment Collision Prediction and Localization from Egocentric Videos [62.34712951567793]
The ability to forecast human-environment collisions from egocentric observations is vital to enable collision avoidance in applications such as VR, AR, and wearable assistive robotics.
We introduce the challenging problem of predicting collisions in diverse environments from multi-view egocentric videos captured from body-mounted cameras.
We propose a transformer-based model called COPILOT to perform collision prediction and localization simultaneously.
arXiv Detail & Related papers (2022-10-04T17:49:23Z)
- iSDF: Real-Time Neural Signed Distance Fields for Robot Perception [64.80458128766254]
iSDF is a continuous learning system for real-time signed distance field reconstruction.
It produces more accurate reconstructions and better approximations of collision costs and gradients.
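Collision costs and gradients fall out of a signed distance field almost directly: the field value at a point is its distance to the nearest surface (negative inside), and a hinge on that value gives a cost whose gradient pushes a robot away from obstacles. In the sketch below a toy analytic sphere stands in for iSDF's learned field; the cost shape, margin, and function names are illustrative assumptions.

```python
import numpy as np

def sphere_sdf(p, center=np.zeros(3), radius=0.3):
    """Signed distance to a sphere: negative inside, positive outside."""
    return np.linalg.norm(p - center) - radius

def collision_cost(p, sdf, margin=0.1):
    """Hinge cost: zero when farther than `margin` from the surface,
    growing linearly as the point approaches or penetrates it."""
    return max(0.0, margin - sdf(p))

def cost_gradient(p, sdf, margin=0.1, eps=1e-5):
    """Finite-difference gradient of the collision cost. It points
    toward the obstacle, so a planner descends by moving away."""
    g = np.zeros(3)
    for i in range(3):
        dp = np.zeros(3); dp[i] = eps
        g[i] = (collision_cost(p + dp, sdf, margin)
                - collision_cost(p - dp, sdf, margin)) / (2 * eps)
    return g

far   = np.array([1.0, 0.0, 0.0])    # sdf = 0.7  -> cost 0
close = np.array([0.35, 0.0, 0.0])   # sdf = 0.05 -> cost 0.05
print(collision_cost(far, sphere_sdf), collision_cost(close, sphere_sdf))
print(cost_gradient(close, sphere_sdf))  # approx [-1, 0, 0]
```

A learned field like iSDF exposes the same two quantities, but evaluated by a network over reconstructions of real scenes rather than by a closed-form shape.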
arXiv Detail & Related papers (2022-04-05T15:48:39Z)
- PQ-Transformer: Jointly Parsing 3D Objects and Layouts from Point Clouds [4.381579507834533]
3D scene understanding from point clouds plays a vital role for various robotic applications.
Current state-of-the-art methods use separate neural networks for different tasks like object detection or room layout estimation.
We propose the first transformer architecture that predicts 3D objects and layouts simultaneously.
arXiv Detail & Related papers (2021-09-12T17:31:59Z)
- SIMstack: A Generative Shape and Instance Model for Unordered Object Stacks [38.042876641457255]
We propose a depth-conditioned Variational Auto-Encoder (VAE) trained on a dataset of objects stacked under physics simulation.
We formulate instance segmentation as a centre voting task which allows for class-agnostic detection and doesn't require setting the maximum number of objects in the scene.
Our method has practical applications in giving robots some of the human ability to make rapid, intuitive inferences about partially observed scenes.
arXiv Detail & Related papers (2021-03-30T15:42:43Z)
- Object Rearrangement Using Learned Implicit Collision Functions [61.90305371998561]
We propose a learned collision model that accepts scene and query object point clouds and predicts collisions for 6DOF object poses within the scene.
We leverage the learned collision model as part of a model predictive path integral (MPPI) policy in a tabletop rearrangement task.
The learned model outperforms both traditional pipelines and learned ablations by 9.8% in accuracy on a dataset of simulated collision queries.
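The MPPI scheme used both here and in CabiNet can be sketched in a few lines: sample noisy control sequences around a nominal plan, roll them out, score each rollout with goal and collision costs, and update the plan as a softmax-weighted average of the samples. The 2D point-mass dynamics, the hand-coded sphere obstacle (standing in for the learned collision model), and all constants below are illustrative assumptions.

```python
import numpy as np

def mppi_step(x0, goal, obstacle, radius, u_nom,
              n_samples=256, sigma=0.2, lam=0.05, rng=None):
    """One MPPI update for a 2D point mass with dynamics x_{t+1} = x_t + u_t."""
    rng = rng or np.random.default_rng(0)
    horizon = len(u_nom)
    # Sample perturbed control sequences around the nominal plan.
    u = u_nom[None] + rng.normal(0.0, sigma, size=(n_samples, horizon, 2))
    traj = x0 + np.cumsum(u, axis=1)                  # (S, H, 2) rollouts
    goal_cost = np.linalg.norm(traj[:, -1] - goal, axis=-1)
    # Collision term: large penalty if any state enters the obstacle.
    # (The papers replace this geometric check with a learned model.)
    inside = np.linalg.norm(traj - obstacle, axis=-1) < radius
    cost = goal_cost + 1e3 * inside.any(axis=1)
    # Softmax-weighted average of the sampled controls.
    w = np.exp(-(cost - cost.min()) / lam)
    w /= w.sum()
    return (w[:, None, None] * u).sum(axis=0)

x0, goal = np.zeros(2), np.array([2.0, 0.0])
obstacle, radius = np.array([1.0, 0.3]), 0.3          # obstacle near the path
u = np.zeros((20, 2))
rng = np.random.default_rng(1)
for _ in range(30):                                    # iterative refinement
    u = mppi_step(x0, goal, obstacle, radius, u, rng=rng)
traj = x0 + np.cumsum(u, axis=0)
print(np.linalg.norm(traj[-1] - goal))                 # endpoint error
```

In CabiNet the geometric `inside` test becomes a batched query to the learned collision network, and the SDF-derived waypoints bias the sampled trajectories through tight openings such as cabinet doors.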
arXiv Detail & Related papers (2020-11-21T05:36:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.