Learning to Infer Kinematic Hierarchies for Novel Object Instances
- URL: http://arxiv.org/abs/2110.07911v1
- Date: Fri, 15 Oct 2021 07:50:48 GMT
- Title: Learning to Infer Kinematic Hierarchies for Novel Object Instances
- Authors: Hameed Abdul-Rashid, Miles Freeman, Ben Abbatematteo, George
Konidaris, Daniel Ritchie
- Abstract summary: Our system infers the moving parts of an object and the kinematic couplings that relate them.
We evaluate our system on simulated scans of 3D objects, and we demonstrate a proof-of-concept use of our system to drive real-world robotic manipulation.
- Score: 12.50766181856788
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Manipulating an articulated object requires perceiving its kinematic
hierarchy: its parts, how each can move, and how those motions are coupled.
Previous work has explored perception for kinematics, but none infers a
complete kinematic hierarchy on never-before-seen object instances, without
relying on a schema or template. We present a novel perception system that
achieves this goal. Our system infers the moving parts of an object and the
kinematic couplings that relate them. To infer parts, it uses a point cloud
instance segmentation neural network, and to infer kinematic hierarchies, it
uses a graph neural network to predict the existence, direction, and type of
edges (i.e. joints) that relate the inferred parts. We train these networks
using simulated scans of synthetic 3D models. We evaluate our system on
simulated scans of 3D objects, and we demonstrate a proof-of-concept use of
our system to drive real-world robotic manipulation.
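The abstract describes a two-stage pipeline: a point cloud instance segmentation network proposes the moving parts, and a graph neural network predicts which pairs of parts are connected by a joint, in which direction, and of what type. As a rough illustration of what assembling such predictions into a kinematic hierarchy could look like, the sketch below defines a hypothetical part/joint data structure and builds a tree from a pairwise score matrix; the class names, the `edge_scores`/`edge_types` inputs, and the greedy cycle-avoiding parent selection are all assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (not the paper's implementation): given parts proposed by an
# instance segmentation network and pairwise joint predictions from a graph
# network, assemble a kinematic hierarchy. All names here are hypothetical.
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Part:
    part_id: int
    points: np.ndarray          # (N, 3) points assigned to this part

@dataclass
class Joint:
    parent: int                 # part_id of the parent link
    child: int                  # part_id of the child link
    joint_type: str             # e.g. "revolute" or "prismatic"

@dataclass
class KinematicHierarchy:
    parts: list
    joints: list = field(default_factory=list)

def assemble_hierarchy(parts, edge_scores, edge_types, root_id=0):
    """Greedy tree assembly from pairwise predictions.

    edge_scores[i, j] is an assumed probability that part j is a child of
    part i; edge_types[i, j] is the predicted joint type for that edge.
    Each non-root part picks its highest-scoring parent, skipping choices
    that would create a cycle (a simple heuristic, not the paper's method).
    """
    hierarchy = KinematicHierarchy(parts=parts)
    parent_of = {root_id: None}
    for child in range(len(parts)):
        if child == root_id:
            continue
        for parent in np.argsort(-edge_scores[:, child]):
            parent = int(parent)
            if parent == child:
                continue
            # Reject a candidate whose ancestor chain already contains `child`.
            ancestor, cycle = parent, False
            while ancestor is not None:
                if ancestor == child:
                    cycle = True
                    break
                ancestor = parent_of.get(ancestor)
            if not cycle:
                parent_of[child] = parent
                hierarchy.joints.append(
                    Joint(parent=parent, child=child,
                          joint_type=edge_types[parent][child]))
                break
    return hierarchy
```

In this sketch, `edge_scores` and `edge_types` stand in for the outputs of the paper's graph network; any edge-prediction model that produces such a matrix could be plugged in.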
Related papers
- SUGAR: Pre-training 3D Visual Representations for Robotics [85.55534363501131]
We introduce a novel 3D pre-training framework for robotics named SUGAR.
SUGAR captures semantic, geometric and affordance properties of objects through 3D point clouds.
We show that SUGAR's 3D representation outperforms state-of-the-art 2D and 3D representations.
arXiv Detail & Related papers (2024-04-01T21:23:03Z) - Kinematic-aware Prompting for Generalizable Articulated Object
Manipulation with LLMs [53.66070434419739]
Generalizable articulated object manipulation is essential for home-assistant robots.
We propose a kinematic-aware prompting framework that prompts Large Language Models with kinematic knowledge of objects to generate low-level motion waypoints.
Our framework outperforms traditional methods on 8 seen categories and shows powerful zero-shot capability on 8 unseen articulated object categories.
arXiv Detail & Related papers (2023-11-06T03:26:41Z) - NAP: Neural 3D Articulation Prior [31.875925637190328]
We propose Neural 3D Articulation Prior (NAP), the first 3D deep generative model to synthesize 3D articulated object models.
To generate articulated objects, we first design a novel articulation tree/graph parameterization and then apply a diffusion-denoising probabilistic model over this representation.
In order to capture both the geometry and the motion structure whose distribution will affect each other, we design a graph-attention denoising network for learning the reverse diffusion process.
arXiv Detail & Related papers (2023-05-25T17:59:35Z) - Semi-Weakly Supervised Object Kinematic Motion Prediction [56.282759127180306]
Given a 3D object, kinematic motion prediction aims to identify the mobile parts as well as the corresponding motion parameters.
We propose a graph neural network to learn the mapping between hierarchical part-level segmentation and mobile part parameters.
The network predictions yield a large-scale set of 3D objects with pseudo-labeled mobility information.
arXiv Detail & Related papers (2023-03-31T02:37:36Z) - CA$^2$T-Net: Category-Agnostic 3D Articulation Transfer from Single
Image [41.70960551470232]
We present a neural network approach to transfer the motion from a single image of an articulated object to a rest-state (i.e., unarticulated) 3D model.
Our network learns to predict the object's pose, part segmentation, and corresponding motion parameters to reproduce the articulation shown in the input image.
arXiv Detail & Related papers (2023-01-05T18:57:12Z) - Unsupervised Kinematic Motion Detection for Part-segmented 3D Shape
Collections [14.899075941080541]
We present an unsupervised approach for discovering articulated motions in a part-segmented 3D shape collection.
Our approach is based on a concept we call category closure: any valid articulation of an object's parts should keep the object in the same semantic category.
We evaluate our approach by using it to re-discover part motions from the PartNet-Mobility dataset.
arXiv Detail & Related papers (2022-06-17T00:50:36Z) - FlowBot3D: Learning 3D Articulation Flow to Manipulate Articulated Objects [14.034256001448574]
We propose a vision-based system that learns to predict the potential motions of the parts of a variety of articulated objects.
We deploy an analytical motion planner based on this vector field to achieve a policy that yields maximum articulation (a minimal sketch of this action-selection idea appears after this list).
Results show that our system achieves state-of-the-art performance in both simulated and real-world experiments.
arXiv Detail & Related papers (2022-05-09T15:35:33Z) - Towards 3D Scene Understanding by Referring Synthetic Models [65.74211112607315]
Existing methods typically rely on extensive annotations of real scene scans.
We explore how labelled synthetic models can supervise the recognition of object categories in real scenes by mapping real and synthetic features into a unified feature space.
Experiments show that our method achieves an average mAP of 46.08% and 55.49% on the ScanNet and S3DIS datasets, respectively.
arXiv Detail & Related papers (2022-03-20T13:06:15Z) - Learning Multi-Object Dynamics with Compositional Neural Radiance Fields [63.424469458529906]
We present a method to learn compositional predictive models from image observations based on implicit object encoders, Neural Radiance Fields (NeRFs), and graph neural networks.
NeRFs have become a popular choice for representing scenes due to their strong 3D prior.
For planning, we utilize RRTs in the learned latent space, where we can exploit our model and the implicit object encoder to make sampling the latent space informative and more efficient.
arXiv Detail & Related papers (2022-02-24T01:31:29Z) - 3D Neural Scene Representations for Visuomotor Control [78.79583457239836]
We learn models for dynamic 3D scenes purely from 2D visual observations.
A dynamics model, constructed over the learned representation space, enables visuomotor control for challenging manipulation tasks.
arXiv Detail & Related papers (2021-07-08T17:49:37Z) - "What's This?" -- Learning to Segment Unknown Objects from Manipulation
Sequences [27.915309216800125]
We present a novel framework for self-supervised grasped object segmentation with a robotic manipulator.
We propose a single, end-to-end trainable architecture which jointly incorporates motion cues and semantic knowledge.
Our method neither depends on any visual registration of a kinematic robot or 3D object models, nor on precise hand-eye calibration or any additional sensor data.
arXiv Detail & Related papers (2020-11-06T10:55:28Z)
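The FlowBot3D entry above describes predicting a per-point articulation flow field and then acting where that predicted motion is largest. The following sketch illustrates that action-selection idea only in outline; the function name, the `flow` array, and the magnitude threshold are assumptions for illustration and do not reproduce the paper's planner.

```python
# Minimal sketch of max-articulation action selection from a predicted
# per-point flow field (illustrative; not FlowBot3D's actual planner).
import numpy as np

def select_action(points: np.ndarray, flow: np.ndarray, min_flow: float = 1e-3):
    """Pick the contact point whose predicted motion is largest.

    points: (N, 3) observed point cloud of the articulated object.
    flow:   (N, 3) predicted per-point 3D motion (the articulation flow).
    Returns a (contact_point, push_direction) pair, or None if every
    predicted motion is below `min_flow` (i.e. nothing appears movable).
    """
    magnitudes = np.linalg.norm(flow, axis=1)
    best = int(np.argmax(magnitudes))
    if magnitudes[best] < min_flow:
        return None
    direction = flow[best] / magnitudes[best]   # unit vector to push/pull along
    return points[best], direction

# Toy example: only the second point is predicted to move.
pts = np.array([[0.0, 0.0, 0.0], [0.1, 0.2, 0.3]])
flw = np.array([[0.0, 0.0, 0.0], [0.0, 0.05, 0.0]])
print(select_action(pts, flw))
```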
This list is automatically generated from the titles and abstracts of the papers on this site.