CORN: Contact-based Object Representation for Nonprehensile Manipulation of General Unseen Objects
- URL: http://arxiv.org/abs/2403.10760v1
- Date: Sat, 16 Mar 2024 01:47:53 GMT
- Title: CORN: Contact-based Object Representation for Nonprehensile Manipulation of General Unseen Objects
- Authors: Yoonyoung Cho, Junhyek Han, Yoontae Cho, Beomjoon Kim,
- Abstract summary: Nonprehensile manipulation is essential for manipulating objects that are too thin, large, or otherwise ungraspable in the wild.
We propose a novel contact-based object representation and pretraining pipeline to tackle this.
- Score: 1.3299507495084417
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nonprehensile manipulation is essential for manipulating objects that are too thin, large, or otherwise ungraspable in the wild. To sidestep the difficulty of contact modeling in conventional modeling-based approaches, reinforcement learning (RL) has recently emerged as a promising alternative. However, previous RL approaches either lack the ability to generalize over diverse object shapes, or use simple action primitives that limit the diversity of robot motions. Furthermore, using RL over diverse object geometry is challenging due to the high cost of training a policy that takes in high-dimensional sensory inputs. We propose a novel contact-based object representation and pretraining pipeline to tackle this. To enable massively parallel training, we leverage a lightweight patch-based transformer architecture for our encoder that processes point clouds, thus scaling our training across thousands of environments. Compared to learning from scratch, or other shape representation baselines, our representation facilitates both time- and data-efficient learning. We validate the efficacy of our overall system by zero-shot transferring the trained policy to novel real-world objects. Code and videos are available at https://sites.google.com/view/contact-non-prehensile.
Related papers
- Mitigating Object Dependencies: Improving Point Cloud Self-Supervised Learning through Object Exchange [50.45953583802282]
We introduce a novel self-supervised learning (SSL) strategy for point cloud scene understanding.
Our approach leverages both object patterns and contextual cues to produce robust features.
Our experiments demonstrate the superiority of our method over existing SSL techniques.
arXiv Detail & Related papers (2024-04-11T06:39:53Z) - Grasp Anything: Combining Teacher-Augmented Policy Gradient Learning with Instance Segmentation to Grasp Arbitrary Objects [18.342569823885864]
Teacher-Augmented Policy Gradient (TAPG) is a novel two-stage learning framework that synergizes reinforcement learning and policy distillation.
TAPG facilitates guided, yet adaptive, learning of a sensorimotor policy, based on object segmentation.
Our trained policies adeptly grasp a wide variety of objects from cluttered scenarios in simulation and the real world based on human-understandable prompts.
arXiv Detail & Related papers (2024-03-15T10:48:16Z) - Learning Extrinsic Dexterity with Parameterized Manipulation Primitives [8.7221770019454]
We learn a sequence of actions that utilize the environment to change the object's pose.
Our approach can control the object's state through exploiting interactions between the object, the gripper, and the environment.
We evaluate our approach on picking box-shaped objects of various weight, shape, and friction properties from a constrained table-top workspace.
arXiv Detail & Related papers (2023-10-26T21:28:23Z) - REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous
Manipulation [61.7171775202833]
We introduce an efficient system for learning dexterous manipulation skills withReinforcement learning.
The main idea of our approach is the integration of recent advances in sample-efficient RL and replay buffer bootstrapping.
Our system completes the real-world training cycle by incorporating learned resets via an imitation-based pickup policy.
arXiv Detail & Related papers (2023-09-06T19:05:31Z) - Transferring Foundation Models for Generalizable Robotic Manipulation [82.12754319808197]
We propose a novel paradigm that effectively leverages language-reasoning segmentation mask generated by internet-scale foundation models.
Our approach can effectively and robustly perceive object pose and enable sample-efficient generalization learning.
Demos can be found in our submitted video, and more comprehensive ones can be found in link1 or link2.
arXiv Detail & Related papers (2023-06-09T07:22:12Z) - Efficient Representations of Object Geometry for Reinforcement Learning
of Interactive Grasping Policies [29.998917158604694]
We present a reinforcement learning framework that learns the interactive grasping of various geometrically distinct real-world objects.
Videos of learned interactive policies are available at https://maltemosbach.org/io/geometry_aware_grasping_policies.
arXiv Detail & Related papers (2022-11-20T11:47:33Z) - Bridging the Gap to Real-World Object-Centric Learning [66.55867830853803]
We show that reconstructing features from models trained in a self-supervised manner is a sufficient training signal for object-centric representations to arise in a fully unsupervised way.
Our approach, DINOSAUR, significantly out-performs existing object-centric learning models on simulated data.
arXiv Detail & Related papers (2022-09-29T15:24:47Z) - Object Scene Representation Transformer [56.40544849442227]
We introduce Object Scene Representation Transformer (OSRT), a 3D-centric model in which individual object representations naturally emerge through novel view synthesis.
OSRT scales to significantly more complex scenes with larger diversity of objects and backgrounds than existing methods.
It is multiple orders of magnitude faster at compositional rendering thanks to its light field parametrization and the novel Slot Mixer decoder.
arXiv Detail & Related papers (2022-06-14T15:40:47Z) - Beyond Pick-and-Place: Tackling Robotic Stacking of Diverse Shapes [29.49728031012592]
We study the problem of robotic stacking with objects of complex geometry.
We propose a challenging and diverse set of objects that was carefully designed to require strategies beyond a simple "pick-and-place" solution.
Our method is a reinforcement learning (RL) approach combined with vision-based interactive policy distillation and simulation-to-reality transfer.
arXiv Detail & Related papers (2021-10-12T17:46:06Z) - COCOI: Contact-aware Online Context Inference for Generalizable
Non-planar Pushing [87.7257446869134]
General contact-rich manipulation problems are long-standing challenges in robotics.
Deep reinforcement learning has shown great potential in solving robot manipulation tasks.
We propose COCOI, a deep RL method that encodes a context embedding of dynamics properties online.
arXiv Detail & Related papers (2020-11-23T08:20:21Z) - Learning Rope Manipulation Policies Using Dense Object Descriptors
Trained on Synthetic Depth Data [32.936908766549344]
We present an approach that learns point-pair correspondences between initial and goal rope configurations.
In 50 trials of a knot-tying task with the ABB YuMi Robot, the system achieves a 66% knot-tying success rate from previously unseen configurations.
arXiv Detail & Related papers (2020-03-03T23:43:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.