A System for General In-Hand Object Re-Orientation
- URL: http://arxiv.org/abs/2111.03043v1
- Date: Thu, 4 Nov 2021 17:47:39 GMT
- Title: A System for General In-Hand Object Re-Orientation
- Authors: Tao Chen, Jie Xu, Pulkit Agrawal
- Abstract summary: We present a model-free framework that can learn to reorient objects with both the hand facing upwards and downwards.
We demonstrate the capability of reorienting over 2000 geometrically different objects in both cases.
- Score: 23.538271727475525
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In-hand object reorientation has been a challenging problem in robotics due
to high dimensional actuation space and the frequent change in contact state
between the fingers and the objects. We present a simple model-free framework
that can learn to reorient objects with both the hand facing upwards and
downwards. We demonstrate the capability of reorienting over 2000 geometrically
different objects in both cases. The learned policies show strong zero-shot
transfer performance on new objects. We provide evidence that these policies
are amenable to real-world operation by distilling them to use observations
easily available in the real world. The videos of the learned policies are
available at: https://taochenshh.github.io/projects/in-hand-reorientation.
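The abstract notes that the learned policies are made amenable to real-world operation by distilling them to use observations that are easily available in the real world. The Python sketch below illustrates one plausible form of such teacher-student distillation: a teacher policy with access to privileged simulator state supervises a student policy restricted to deployable observations. The observation sizes, network widths, and MSE imitation loss are illustrative assumptions, not the paper's exact configuration.

import torch
import torch.nn as nn

PRIV_OBS_DIM = 128   # privileged simulator state (contacts, velocities, ...) -- assumed size
REAL_OBS_DIM = 48    # observations available on real hardware (joints, object pose) -- assumed size
ACT_DIM = 20         # hand joint targets -- assumed size

def mlp(in_dim, out_dim, hidden=256):
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ELU(),
        nn.Linear(hidden, hidden), nn.ELU(),
        nn.Linear(hidden, out_dim),
    )

teacher = mlp(PRIV_OBS_DIM, ACT_DIM)   # stand-in for an RL-trained policy on privileged state
student = mlp(REAL_OBS_DIM, ACT_DIM)   # deployable policy restricted to realistic inputs
optimizer = torch.optim.Adam(student.parameters(), lr=3e-4)

def distill_batch(priv_obs, real_obs):
    """One supervised distillation step: regress student actions onto teacher actions."""
    with torch.no_grad():
        target_actions = teacher(priv_obs)
    loss = nn.functional.mse_loss(student(real_obs), target_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with random tensors standing in for paired simulator observations.
for _ in range(3):
    priv = torch.randn(64, PRIV_OBS_DIM)
    real = torch.randn(64, REAL_OBS_DIM)
    print(distill_batch(priv, real))

In practice the paired observations would come from policy rollouts in simulation rather than random tensors; the supervised regression step above is the core of the distillation idea.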
Related papers
- From Simple to Complex Skills: The Case of In-Hand Object Reorientation [45.58997623305503]
We introduce a hierarchical policy for in-hand object reorientation based on previously acquired rotation skills.
This hierarchical policy learns to select which low-level skill to execute based on feedback from both the environment and the low-level skill policies themselves (a minimal sketch of this selection loop appears after the related-papers list below).
We propose a generalizable object pose estimator that uses proprioceptive information, low-level skill predictions, and control errors as inputs to estimate the object pose over time.
arXiv Detail & Related papers (2025-01-09T18:49:39Z) - Interacted Object Grounding in Spatio-Temporal Human-Object Interactions [70.8859442754261]
We introduce a new open-world benchmark: Grounding Interacted Objects (GIO).
An object grounding task is proposed in which vision systems are expected to discover the interacted objects.
We propose a 4D question-answering framework (4D-QA) to discover interacted objects from diverse videos.
arXiv Detail & Related papers (2024-12-27T09:08:46Z) - Lessons from Learning to Spin "Pens" [51.9182692233916]
In this work, we push the boundaries of learning-based in-hand manipulation systems by demonstrating the capability to spin pen-like objects.
We first use reinforcement learning to train an oracle policy with privileged information and generate a high-fidelity trajectory dataset in simulation.
We then fine-tune a sensorimotor policy on real-world trajectories to adapt it to real-world dynamics.
arXiv Detail & Related papers (2024-07-26T17:56:01Z) - Grasp Anything: Combining Teacher-Augmented Policy Gradient Learning with Instance Segmentation to Grasp Arbitrary Objects [18.342569823885864]
Teacher-Augmented Policy Gradient (TAPG) is a novel two-stage learning framework that synergizes reinforcement learning and policy distillation.
TAPG facilitates guided yet adaptive learning of a sensorimotor policy based on object segmentation.
Our trained policies adeptly grasp a wide variety of objects from cluttered scenarios in simulation and the real world based on human-understandable prompts.
arXiv Detail & Related papers (2024-03-15T10:48:16Z) - Learning Generalizable Manipulation Policies with Object-Centric 3D Representations [65.55352131167213]
GROOT is an imitation learning method for learning robust policies with object-centric and 3D priors.
It builds policies that generalize beyond their initial training conditions for vision-based manipulation.
GROOT generalizes well to background changes, camera viewpoint shifts, and new object instances.
arXiv Detail & Related papers (2023-10-22T18:51:45Z) - Learning Explicit Contact for Implicit Reconstruction of Hand-held Objects from Monocular Images [59.49985837246644]
We show how to model contacts in an explicit way to benefit the implicit reconstruction of hand-held objects.
In the first part, we propose a new subtask of directly estimating 3D hand-object contacts from a single image.
In the second part, we introduce a novel method to diffuse estimated contact states from the hand mesh surface to nearby 3D space.
arXiv Detail & Related papers (2023-05-31T17:59:26Z) - Visual Dexterity: In-Hand Reorientation of Novel and Complex Object Shapes [31.05016510558315]
In-hand object reorientation is necessary for performing many dexterous manipulation tasks.
We present a general object reorientation controller that does not rely on the simplifying assumptions made in prior work.
The controller is trained using reinforcement learning in simulation and evaluated in the real world on new object shapes.
arXiv Detail & Related papers (2022-11-21T18:59:33Z) - Efficient Representations of Object Geometry for Reinforcement Learning of Interactive Grasping Policies [29.998917158604694]
We present a reinforcement learning framework that learns the interactive grasping of various geometrically distinct real-world objects.
Videos of learned interactive policies are available at https://maltemosbach.org/io/geometry_aware_grasping_policies.
arXiv Detail & Related papers (2022-11-20T11:47:33Z) - Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning [108.08083976908195]
We show that policies learned by existing reinforcement learning algorithms can in fact be generalist.
We show that a single generalist policy can perform in-hand manipulation of over 100 geometrically-diverse real-world objects.
Interestingly, we find that multi-task learning with object point cloud representations not only generalizes better but even outperforms single-object specialist policies (a point-cloud-conditioned policy sketch appears after this list).
arXiv Detail & Related papers (2021-11-04T17:59:56Z) - Generalization Through Hand-Eye Coordination: An Action Space for Learning Spatially-Invariant Visuomotor Control [67.23580984118479]
Imitation Learning (IL) is an effective framework to learn visuomotor skills from offline demonstration data.
Hand-eye Action Networks (HAN) can approximate human hand-eye coordination behaviors by learning from human teleoperated demonstrations.
arXiv Detail & Related papers (2021-02-28T01:49:13Z) - Reactive Human-to-Robot Handovers of Arbitrary Objects [57.845894608577495]
We present a vision-based system that enables human-to-robot handovers of unknown objects.
Our approach combines closed-loop motion planning with real-time, temporally-consistent grasp generation.
We demonstrate the generalizability, usability, and robustness of our approach on a novel benchmark set of 26 diverse household objects.
arXiv Detail & Related papers (2020-11-17T21:52:22Z)
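As referenced in the "From Simple to Complex Skills" entry above, the sketch below illustrates the general shape of hierarchical skill selection: a high-level selector chooses among pre-acquired low-level rotation skills using the current observation and the skills' own proposed actions. The module sizes, selector inputs, and argmax selection rule are hypothetical; this is a sketch of the idea, not that paper's implementation.

import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, NUM_SKILLS = 64, 20, 4   # assumed sizes

# Stand-ins for previously acquired low-level rotation skill policies.
skills = nn.ModuleList([
    nn.Sequential(nn.Linear(OBS_DIM, 128), nn.ELU(), nn.Linear(128, ACT_DIM))
    for _ in range(NUM_SKILLS)
])

# High-level selector: sees the observation plus every skill's proposed action
# and scores which skill to execute.
selector = nn.Sequential(
    nn.Linear(OBS_DIM + NUM_SKILLS * ACT_DIM, 128), nn.ELU(),
    nn.Linear(128, NUM_SKILLS),
)

def act(obs):
    """Pick a low-level skill and return its action for the current control step."""
    with torch.no_grad():
        proposals = torch.stack([skill(obs) for skill in skills])    # (NUM_SKILLS, ACT_DIM)
        logits = selector(torch.cat([obs, proposals.flatten()]))
        skill_idx = torch.argmax(logits).item()
    return skill_idx, proposals[skill_idx]

obs = torch.randn(OBS_DIM)   # stands in for proprioception and estimated object pose
idx, action = act(obs)
print(idx, action.shape)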
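Similarly, for the "Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning" entry, the sketch below shows one way a single generalist policy can be conditioned on an object point cloud: a small PointNet-style encoder (per-point MLP followed by max pooling) produces an order-invariant object embedding that is concatenated with proprioception. All dimensions and the encoder design are assumptions for illustration, not that paper's architecture.

import torch
import torch.nn as nn

PROPRIO_DIM, ACT_DIM, EMBED_DIM = 24, 16, 64   # assumed sizes

class PointCloudEncoder(nn.Module):
    """Per-point MLP followed by max pooling: an order-invariant object embedding."""
    def __init__(self):
        super().__init__()
        self.point_mlp = nn.Sequential(nn.Linear(3, 64), nn.ELU(), nn.Linear(64, EMBED_DIM))

    def forward(self, points):                              # points: (N, 3)
        return self.point_mlp(points).max(dim=0).values     # (EMBED_DIM,)

encoder = PointCloudEncoder()
policy = nn.Sequential(
    nn.Linear(EMBED_DIM + PROPRIO_DIM, 128), nn.ELU(),
    nn.Linear(128, ACT_DIM),
)

# Toy forward pass: the same policy weights are shared across objects, with the
# object identity conveyed only through its point cloud embedding.
points = torch.randn(512, 3)         # sampled object surface points (stand-in)
proprio = torch.randn(PROPRIO_DIM)   # joint positions, fingertip states, etc. (stand-in)
action = policy(torch.cat([encoder(points), proprio]))
print(action.shape)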
This list is automatically generated from the titles and abstracts of the papers on this site.