Visual Dexterity: In-Hand Reorientation of Novel and Complex Object Shapes
- URL: http://arxiv.org/abs/2211.11744v3
- Date: Fri, 24 Nov 2023 18:53:31 GMT
- Title: Visual Dexterity: In-Hand Reorientation of Novel and Complex Object Shapes
- Authors: Tao Chen, Megha Tippur, Siyang Wu, Vikash Kumar, Edward Adelson,
Pulkit Agrawal
- Abstract summary: In-hand object reorientation is necessary for performing many dexterous manipulation tasks.
We present a general object reorientation controller that does not make these assumptions.
The controller is trained using reinforcement learning in simulation and evaluated in the real world on new object shapes.
- Score: 31.05016510558315
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In-hand object reorientation is necessary for performing many dexterous
manipulation tasks, such as tool use in less structured environments that
remain beyond the reach of current robots. Prior works built reorientation
systems assuming one or many of the following: reorienting only specific
objects with simple shapes, limited range of reorientation, slow or quasistatic
manipulation, simulation-only results, the need for specialized and costly
sensor suites, and other constraints which make the system infeasible for
real-world deployment. We present a general object reorientation controller
that does not make these assumptions. It uses readings from a single commodity
depth camera to dynamically reorient complex and new object shapes by any
rotation in real-time, with the median reorientation time being close to seven
seconds. The controller is trained using reinforcement learning in simulation
and evaluated in the real world on new object shapes not used for training,
including the most challenging scenario of reorienting objects held in the air
by a downward-facing hand that must counteract gravity during reorientation.
Our hardware platform only uses open-source components that cost less than five
thousand dollars. Although we demonstrate the ability to overcome assumptions
in prior work, there is ample scope for improving absolute performance. For
instance, the challenging duck-shaped object not used for training was dropped
in 56 percent of the trials. When it was not dropped, our controller reoriented
the object within 0.4 radians (23 degrees) 75 percent of the time. Videos are
available at: https://taochenshh.github.io/projects/visual-dexterity.
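The abstract's success criterion ("within 0.4 radians (23 degrees)") is a rotation-distance threshold. As a minimal sketch of how such an error might be measured (the quaternion-based geodesic metric and helper name here are assumptions for illustration, not taken from the paper):

```python
import numpy as np

def rotation_error_rad(q_goal, q_current):
    """Geodesic angle (radians) between two unit quaternions (w, x, y, z)."""
    dot = abs(float(np.dot(q_goal, q_current)))  # abs() handles the double cover (q and -q)
    dot = min(1.0, dot)                          # guard against rounding past 1.0
    return 2.0 * np.arccos(dot)

# Example: identity orientation vs. a 23-degree rotation about the z-axis
theta = np.deg2rad(23.0)
q_id = np.array([1.0, 0.0, 0.0, 0.0])
q_rot = np.array([np.cos(theta / 2), 0.0, 0.0, np.sin(theta / 2)])
print(rotation_error_rad(q_id, q_rot))  # ~0.401 rad, i.e. just inside the 0.4-rad threshold
```

A trial like the duck-shaped-object experiment would count as a success under this metric whenever the final error falls below 0.4 rad.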
Related papers
- Uncertainty-aware Active Learning of NeRF-based Object Models for Robot Manipulators using Visual and Re-orientation Actions [8.059133373836913]
This paper presents an approach that enables a robot to rapidly learn the complete 3D model of a given object for manipulation in unfamiliar orientations.
We use an ensemble of partially constructed NeRF models to quantify model uncertainty to determine the next action.
Our approach determines when and how to grasp and re-orient an object given its partial NeRF model and re-estimates the object pose to rectify misalignments introduced during the interaction.
arXiv Detail & Related papers (2024-04-02T10:15:06Z)
- Neural feels with neural fields: Visuo-tactile perception for in-hand manipulation [57.60490773016364]
We combine vision and touch sensing on a multi-fingered hand to estimate an object's pose and shape during in-hand manipulation.
Our method, NeuralFeels, encodes object geometry by learning a neural field online and jointly tracks it by optimizing a pose graph problem.
Our results demonstrate that touch, at the very least, refines and, at the very best, disambiguates visual estimates during in-hand manipulation.
arXiv Detail & Related papers (2023-12-20T22:36:37Z)
- CabiNet: Scaling Neural Collision Detection for Object Rearrangement with Procedural Scene Generation [54.68738348071891]
We first generate over 650K cluttered scenes - orders of magnitude more than prior work - in diverse everyday environments.
We render synthetic partial point clouds from this data and use them to train our CabiNet model architecture.
CabiNet is a collision model that accepts object and scene point clouds, captured from a single-view depth observation.
arXiv Detail & Related papers (2023-04-18T21:09:55Z)
- DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking [67.34803048690428]
We propose to model Dynamic Objects in RecurrenT (DORT) to tackle this problem.
DORT extracts object-wise local volumes for motion estimation that also alleviates the heavy computational burden.
It is flexible and practical, and can be plugged into most camera-based 3D object detectors.
arXiv Detail & Related papers (2023-03-29T12:33:55Z)
- A Flexible-Frame-Rate Vision-Aided Inertial Object Tracking System for Mobile Devices [3.4836209951879957]
We propose a flexible-frame-rate object pose estimation and tracking system for mobile devices.
Inertial measurement unit (IMU) pose propagation is performed on the client side for high speed tracking, and RGB image-based 3D pose estimation is performed on the server side.
Our system supports flexible frame rates up to 120 FPS and guarantees high precision and real-time tracking on low-end devices.
arXiv Detail & Related papers (2022-10-22T15:26:50Z)
- Stable Object Reorientation using Contact Plane Registration [32.19425880216469]
We propose to overcome the critical issue of modelling multimodality in the space of rotations by using a conditional generative model.
Our system is capable of operating from noisy and partially-observed pointcloud observations captured by real world depth cameras.
arXiv Detail & Related papers (2022-08-18T17:10:28Z)
- DiffSkill: Skill Abstraction from Differentiable Physics for Deformable Object Manipulations with Tools [96.38972082580294]
DiffSkill is a novel framework that uses a differentiable physics simulator for skill abstraction to solve deformable object manipulation tasks.
In particular, we first obtain short-horizon skills using individual tools from a gradient-based simulator.
We then learn a neural skill abstractor from the demonstration trajectories which takes RGBD images as input.
arXiv Detail & Related papers (2022-03-31T17:59:38Z)
- Discovering Objects that Can Move [55.743225595012966]
We study the problem of object discovery -- separating objects from the background without manual labels.
Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions.
We choose to focus on dynamic objects -- entities that can move independently in the world.
arXiv Detail & Related papers (2022-03-18T21:13:56Z)
- IFOR: Iterative Flow Minimization for Robotic Object Rearrangement [92.97142696891727]
IFOR, Iterative Flow Minimization for Robotic Object Rearrangement, is an end-to-end method for the problem of object rearrangement for unknown objects.
We show that our method applies to cluttered scenes, and in the real world, while training only on synthetic data.
arXiv Detail & Related papers (2022-02-01T20:03:56Z)
- A System for General In-Hand Object Re-Orientation [23.538271727475525]
We present a model-free framework that can learn to reorient objects with both the hand facing upwards and downwards.
We demonstrate the capability of reorienting over 2000 geometrically different objects in both cases.
arXiv Detail & Related papers (2021-11-04T17:47:39Z)
- Orienting Novel 3D Objects Using Self-Supervised Learning of Rotation Transforms [22.91890127146324]
Orienting objects is a critical component in the automation of many packing and assembly tasks.
We train a deep neural network to estimate the 3D rotation as parameterized by a quaternion.
We then use the trained network in a proportional controller to re-orient objects based on the estimated rotation between the two depth images.
arXiv Detail & Related papers (2021-05-29T08:22:55Z)
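The last entry above describes feeding an estimated rotation into a proportional controller. As a minimal sketch of that control step, assuming quaternion-parameterized rotations in (w, x, y, z) order (the function names, gain, and output convention are hypothetical, not from the paper):

```python
import numpy as np

def quat_conj(q):
    """Conjugate of a unit quaternion, i.e. its inverse rotation."""
    w, x, y, z = q
    return np.array([w, -x, -y, -z])

def quat_mul(a, b):
    """Hamilton product of two quaternions."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def p_control_step(q_goal, q_current, kp=0.5):
    """One proportional step: angular-velocity command driving q_current toward q_goal."""
    q_err = quat_mul(q_goal, quat_conj(q_current))
    if q_err[0] < 0:                       # pick the shorter path on the double cover
        q_err = -q_err
    axis = q_err[1:]
    norm = np.linalg.norm(axis)
    if norm < 1e-9:                        # already at the goal orientation
        return np.zeros(3)
    angle = 2.0 * np.arctan2(norm, q_err[0])
    return kp * angle * (axis / norm)      # command about the error axis, scaled by the error angle
```

Repeating this step as new depth images arrive and new rotation estimates are produced drives the orientation error toward zero, which is the essence of a proportional re-orientation loop.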
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.