Object Detection and Pose Estimation from RGB and Depth Data for Real-time, Adaptive Robotic Grasping
- URL: http://arxiv.org/abs/2101.07347v1
- Date: Mon, 18 Jan 2021 22:22:47 GMT
- Title: Object Detection and Pose Estimation from RGB and Depth Data for Real-time, Adaptive Robotic Grasping
- Authors: S. K. Paul, M. T. Chowdhury, M. Nicolescu, M. Nicolescu
- Abstract summary: We propose a system that performs real-time object detection and pose estimation, for the purpose of dynamic robot grasping.
The proposed approach allows the robot to detect the object identity and its actual pose, and then adapt a canonical grasp in order to be used with the new pose.
For training, the system defines a canonical grasp by capturing the relative pose of an object with respect to the gripper attached to the robot's wrist.
During testing, once a new pose is detected, a canonical grasp for the object is identified and then dynamically adapted by adjusting the robot arm's joint angles.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In recent times, object detection and pose estimation have gained significant
attention in the context of robotic vision applications. Both the
identification of objects of interest as well as the estimation of their pose
remain important capabilities in order for robots to provide effective
assistance for numerous robotic applications ranging from household tasks to
industrial manipulation. This problem is particularly challenging because of
the heterogeneity of objects having different and potentially complex shapes,
and the difficulties arising due to background clutter and partial occlusions
between objects. As the main contribution of this work, we propose a system
that performs real-time object detection and pose estimation, for the purpose
of dynamic robot grasping. The robot has been pre-trained to perform a small
set of canonical grasps from a few fixed poses for each object. When presented
with an unknown object in an arbitrary pose, the proposed approach allows the
robot to detect the object identity and its actual pose, and then adapt a
canonical grasp in order to be used with the new pose. For training, the system
defines a canonical grasp by capturing the relative pose of an object with
respect to the gripper attached to the robot's wrist. During testing, once a
new pose is detected, a canonical grasp for the object is identified and then
dynamically adapted by adjusting the robot arm's joint angles, so that the
gripper can grasp the object in its new pose. We conducted experiments using a
humanoid PR2 robot and showed that the proposed framework can detect
well-textured objects, and provide accurate pose estimation in the presence of
tolerable amounts of out-of-plane rotation. The performance is also illustrated
by the robot successfully grasping objects from a wide range of arbitrary
poses.
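The adaptation step described in the abstract is, at its core, a composition of rigid-body transforms: at training time the system records the gripper pose relative to the object (the canonical grasp), and at test time it applies that stored relative pose to the newly detected object pose to obtain the target gripper pose. The following is a minimal Python/NumPy sketch of that idea; the function names and the use of 4x4 homogeneous transforms are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def invert_transform(T):
    """Invert a 4x4 rigid homogeneous transform."""
    R, t = T[:3, :3], T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv

def record_canonical_grasp(T_base_obj, T_base_grip):
    """Training: store the gripper pose relative to the object,
    i.e. T_obj_grip = inv(T_base_obj) @ T_base_grip."""
    return invert_transform(T_base_obj) @ T_base_grip

def adapt_grasp(T_base_obj_new, T_obj_grip):
    """Testing: given the newly detected object pose in the robot base
    frame, recover the target gripper pose for the same relative grasp."""
    return T_base_obj_new @ T_obj_grip
```

In a full pipeline, the target gripper pose returned by a routine like `adapt_grasp` would then be handed to an inverse-kinematics solver to produce the adjusted arm joint angles that the abstract refers to.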
Related papers
- Precision-Focused Reinforcement Learning Model for Robotic Object Pushing [1.2374541748245842]
Non-prehensile manipulation is an important skill for robots to assist humans in everyday situations.
We introduce a new memory-based vision-proprioception RL model to push objects more precisely to target positions.
arXiv Detail & Related papers (2024-11-13T14:08:58Z)
- Learning Object Properties Using Robot Proprioception via Differentiable Robot-Object Interaction [52.12746368727368]
Differentiable simulation has become a powerful tool for system identification.
Our approach calibrates object properties by using information from the robot, without relying on data from the object itself.
We demonstrate the effectiveness of our method on a low-cost robotic platform.
arXiv Detail & Related papers (2024-10-04T20:48:38Z)
- CtRNet-X: Camera-to-Robot Pose Estimation in Real-world Conditions Using a Single Camera [18.971816395021488]
Markerless pose estimation methods have eliminated the need for time-consuming physical setups for camera-to-robot calibration.
We propose a novel framework capable of estimating the robot pose with partially visible robot manipulators.
arXiv Detail & Related papers (2024-09-16T16:22:43Z)
- 3D Foundation Models Enable Simultaneous Geometry and Pose Estimation of Grasped Objects [13.58353565350936]
We contribute methodology to jointly estimate the geometry and pose of objects grasped by a robot.
Our method transforms the estimated geometry into the robot's coordinate frame.
We empirically evaluate our approach on a robot manipulator holding a diverse set of real-world objects.
arXiv Detail & Related papers (2024-07-14T21:02:55Z)
- ShapeShift: Superquadric-based Object Pose Estimation for Robotic Grasping [85.38689479346276]
Current techniques heavily rely on a reference 3D object, limiting their generalizability and making it expensive to expand to new object categories.
This paper proposes ShapeShift, a superquadric-based framework for object pose estimation that predicts the object's pose relative to a primitive shape which is fitted to the object.
arXiv Detail & Related papers (2023-04-10T20:55:41Z)
- Object Manipulation via Visual Target Localization [64.05939029132394]
Training agents to manipulate objects poses many challenges.
We propose an approach that explores the environment in search of target objects, computes their 3D coordinates once they are located, and then continues to estimate their 3D locations even when the objects are not visible.
Our evaluations show a massive 3x improvement in success rate over a model that has access to the same sensory suite.
arXiv Detail & Related papers (2022-03-15T17:59:01Z)
- V-MAO: Generative Modeling for Multi-Arm Manipulation of Articulated Objects [51.79035249464852]
We present a framework for learning multi-arm manipulation of articulated objects.
Our framework includes a variational generative model that learns contact point distribution over object rigid parts for each robot arm.
arXiv Detail & Related papers (2021-11-07T02:31:09Z)
- Learning to Regrasp by Learning to Place [19.13976401970985]
Regrasping is needed when a robot's current grasp pose fails to perform desired manipulation tasks.
We propose a system for robots to take partial point clouds of an object and the supporting environment as inputs and output a sequence of pick-and-place operations.
We show that our system is able to achieve 73.3% success rate of regrasping diverse objects.
arXiv Detail & Related papers (2021-09-18T03:07:06Z)
- INVIGORATE: Interactive Visual Grounding and Grasping in Clutter [56.00554240240515]
INVIGORATE is a robot system that interacts with humans through natural language and grasps a specified object in clutter.
We train separate neural networks for object detection, for visual grounding, for question generation, and for OBR detection and grasping.
We build a partially observable Markov decision process (POMDP) that integrates the learned neural network modules.
arXiv Detail & Related papers (2021-08-25T07:35:21Z)
- OmniHang: Learning to Hang Arbitrary Objects using Contact Point Correspondences and Neural Collision Estimation [14.989379991558046]
We propose a system that takes partial point clouds of an object and a supporting item as input and learns to decide where and how to hang the object stably.
Our system learns to estimate the contact point correspondences between the object and supporting item to get an estimated stable pose.
Then, the robot needs to find a collision-free path to move the object from its initial pose to the stable hanging pose.
arXiv Detail & Related papers (2021-03-26T06:11:05Z)
- Reactive Human-to-Robot Handovers of Arbitrary Objects [57.845894608577495]
We present a vision-based system that enables human-to-robot handovers of unknown objects.
Our approach combines closed-loop motion planning with real-time, temporally-consistent grasp generation.
We demonstrate the generalizability, usability, and robustness of our approach on a novel benchmark set of 26 diverse household objects.
arXiv Detail & Related papers (2020-11-17T21:52:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.