Distributed Reinforcement Learning of Targeted Grasping with Active
Vision for Mobile Manipulators
- URL: http://arxiv.org/abs/2007.08082v2
- Date: Wed, 14 Oct 2020 08:59:39 GMT
- Title: Distributed Reinforcement Learning of Targeted Grasping with Active
Vision for Mobile Manipulators
- Authors: Yasuhiro Fujita, Kota Uenishi, Avinash Ummadisingu, Prabhat Nagarajan,
Shimpei Masuda, and Mario Ynocente Castro
- Abstract summary: We present the first RL-based system for a mobile manipulator that can (a) achieve targeted grasping generalizing to unseen target objects, (b) learn complex grasping strategies for cluttered scenes with occluded objects, and (c) perform active vision through its movable wrist camera to better locate objects.
We train and evaluate our system in a simulated environment, identify key components for improving performance, analyze its behaviors, and transfer to a real-world setup.
- Score: 4.317864702902075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Developing personal robots that can perform a diverse range of manipulation
tasks in unstructured environments necessitates solving several challenges for
robotic grasping systems. We take a step towards this broader goal by
presenting the first RL-based system, to our knowledge, for a mobile
manipulator that can (a) achieve targeted grasping generalizing to unseen
target objects, (b) learn complex grasping strategies for cluttered scenes with
occluded objects, and (c) perform active vision through its movable wrist
camera to better locate objects. The system is informed of the desired target
object in the form of a single, arbitrary-pose RGB image of that object,
enabling the system to generalize to unseen objects without retraining. To
achieve such a system, we combine several advances in deep reinforcement
learning and present a large-scale distributed training system using
synchronous SGD that seamlessly scales to multi-node, multi-GPU infrastructure
to make rapid prototyping easier. We train and evaluate our system in a
simulated environment, identify key components for improving performance,
analyze its behaviors, and transfer to a real-world setup.
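
The abstract names two concrete ingredients: a grasping policy conditioned on a single RGB image of the desired target object, and synchronous-SGD training that scales across multi-node, multi-GPU infrastructure. Below is a minimal PyTorch sketch of how those two pieces could fit together. It is not the authors' implementation: the network shape, the 64x64 image size, the 7-dim action, and the use of DistributedDataParallel with torchrun are all assumptions for illustration.

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


class TargetConditionedQNet(nn.Module):
    """Q-network conditioned on the wrist-camera view and a target-object image."""

    def __init__(self, action_dim: int = 7):
        super().__init__()

        def make_encoder() -> nn.Sequential:
            # Assumes 64x64 RGB inputs; conv output is 64 x 6 x 6 = 2304.
            return nn.Sequential(
                nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
                nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
                nn.Flatten(), nn.Linear(2304, 256), nn.ReLU(),
            )

        self.obs_enc = make_encoder()   # current wrist-camera observation
        self.goal_enc = make_encoder()  # single arbitrary-pose image of the target
        self.head = nn.Sequential(
            nn.Linear(256 + 256 + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 1),  # Q(obs, goal, action)
        )

    def forward(self, obs, goal, action):
        z = torch.cat([self.obs_enc(obs), self.goal_enc(goal), action], dim=-1)
        return self.head(z)


def train_step(model, batch, optimizer):
    """One synchronous SGD step; DDP all-reduces gradients across all workers."""
    obs, goal, action, target_q = batch
    loss = nn.functional.mse_loss(model(obs, goal, action).squeeze(-1), target_q)
    optimizer.zero_grad()
    loss.backward()  # gradients are averaged over every GPU before the update
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    # Launch with: torchrun --nnodes=N --nproc-per-node=G this_script.py
    dist.init_process_group("nccl")
    device = torch.device("cuda", int(os.environ["LOCAL_RANK"]))
    model = DDP(TargetConditionedQNet().to(device), device_ids=[device.index])
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

    # Synthetic batch, standing in for replay data from the grasping environment.
    batch = tuple(t.to(device) for t in (
        torch.randn(8, 3, 64, 64), torch.randn(8, 3, 64, 64),
        torch.randn(8, 7), torch.randn(8),
    ))
    print("loss:", train_step(model, batch, optimizer))
    dist.destroy_process_group()
```

Because every worker sees the same averaged gradients each step, adding nodes or GPUs changes throughput but not the optimization semantics, which is what lets a setup like this scale "seamlessly" for rapid prototyping.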
Related papers
- Object and Contact Point Tracking in Demonstrations Using 3D Gaussian Splatting [17.03927416536173]
This paper introduces a method to enhance Interactive Imitation Learning (IIL) by extracting touch interaction points and tracking object movement from video demonstrations.
The approach extends current IIL systems by providing robots with detailed knowledge of both where and how to interact with objects, particularly complex articulated ones like doors and drawers.
arXiv Detail & Related papers (2024-11-05T23:28:57Z)
- Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies.
Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors.
We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z)
- Cognitive Planning for Object Goal Navigation using Generative AI Models [0.979851640406258]
We present a novel framework for solving the object goal navigation problem that generates efficient exploration strategies.
Our approach enables a robot to navigate unfamiliar environments by leveraging Large Language Models (LLMs) and Large Vision-Language Models (LVLMs).
arXiv Detail & Related papers (2024-03-30T10:54:59Z)
- Modular Neural Network Policies for Learning In-Flight Object Catching with a Robot Hand-Arm System [55.94648383147838]
We present a modular framework designed to enable a robot hand-arm system to learn how to catch flying objects.
Our framework consists of five core modules: (i) an object state estimator that learns object trajectory prediction, (ii) a catching pose quality network that learns to score and rank object poses for catching, (iii) a reaching control policy trained to move the robot hand to pre-catch poses, (iv) a grasping control policy trained to perform soft catching motions, and (v) a gating network that combines the reaching and grasping actions.
We conduct extensive evaluations of our framework in simulation for each module and the integrated system, demonstrating high success rates of in-flight catching (a minimal structural sketch of these modules appears after this list).
arXiv Detail & Related papers (2023-12-21T16:20:12Z)
- Kinematic-aware Prompting for Generalizable Articulated Object Manipulation with LLMs [53.66070434419739]
Generalizable articulated object manipulation is essential for home-assistant robots.
We propose a kinematic-aware prompting framework that prompts Large Language Models with kinematic knowledge of objects to generate low-level motion waypoints.
Our framework outperforms traditional methods on 8 seen object categories and shows powerful zero-shot capability on 8 unseen articulated object categories (a hypothetical prompt-construction sketch appears after this list).
arXiv Detail & Related papers (2023-11-06T03:26:41Z)
- Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance [71.36749876465618]
We describe a system for vision-based dexterous manipulation that provides a "programming-free" approach for users to define new tasks.
Our system includes a framework for users to define a final task and intermediate sub-tasks with image examples.
We present experimental results with a four-finger robotic hand learning multi-stage object manipulation tasks directly in the real world.
arXiv Detail & Related papers (2022-12-19T22:50:40Z)
- Efficient and Robust Training of Dense Object Nets for Multi-Object Robot Manipulation [8.321536457963655]
We propose a framework for robust and efficient training of Dense Object Nets (DON).
We focus on training with multi-object data instead of singulated objects, combined with a well-chosen augmentation scheme.
We demonstrate the robustness and accuracy of our proposed framework on a real-world robotic grasping task.
arXiv Detail & Related papers (2022-06-24T08:24:42Z)
- V-MAO: Generative Modeling for Multi-Arm Manipulation of Articulated Objects [51.79035249464852]
We present a framework for learning multi-arm manipulation of articulated objects.
Our framework includes a variational generative model that learns contact point distribution over object rigid parts for each robot arm.
arXiv Detail & Related papers (2021-11-07T02:31:09Z)
- Long-Horizon Manipulation of Unknown Objects via Task and Motion Planning with Estimated Affordances [26.082034134908785]
We show that a task-and-motion planner can be used to plan intelligent behaviors even in the absence of a priori knowledge regarding the set of manipulable objects.
We demonstrate that this strategy can enable a single system to perform a wide variety of real-world multi-step manipulation tasks.
arXiv Detail & Related papers (2021-08-09T16:13:47Z)
- MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale [103.7609761511652]
We show how a large-scale collective robotic learning system can acquire a repertoire of behaviors simultaneously.
New tasks can be continuously instantiated from previously learned tasks.
We train and evaluate our system on a set of 12 real-world tasks with data collected from 7 robots.
arXiv Detail & Related papers (2021-04-16T16:38:02Z)
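
For the in-flight catching entry above, the module decomposition is explicit enough to sketch as an interface. The following is a hypothetical skeleton, not the paper's code: every class name, signature, and stub body is an assumption, and module (v), the gating network, is a reconstruction since the scraped summary truncated the fifth item.

```python
import numpy as np


class ObjectStateEstimator:
    """(i) Predicts the object's future poses from recently observed ones."""

    def predict_trajectory(self, past_poses: np.ndarray) -> np.ndarray:
        # Stub: constant-velocity extrapolation over the next 10 steps.
        velocity = past_poses[-1] - past_poses[-2]
        return past_poses[-1] + velocity * np.arange(1, 11)[:, None]


class CatchingPoseQualityNet:
    """(ii) Scores candidate catching poses; higher means easier to catch."""

    def rank(self, poses: np.ndarray) -> np.ndarray:
        return -np.linalg.norm(poses, axis=-1)  # stub: prefer poses near the base


class ReachingPolicy:
    """(iii) Drives the hand toward the chosen pre-catch pose."""

    def act(self, hand_pose: np.ndarray, target_pose: np.ndarray) -> np.ndarray:
        return 0.1 * (target_pose - hand_pose)  # stub: proportional controller


class GraspingPolicy:
    """(iv) Produces the soft catching (finger-closing) motion."""

    def act(self, hand_pose: np.ndarray, object_pose: np.ndarray) -> np.ndarray:
        close = np.linalg.norm(object_pose - hand_pose) < 0.05
        return np.array([1.0 if close else 0.0])  # stub: close fingers when near


class GatingNetwork:
    """(v, assumed) Blends the arm and finger commands into one action."""

    def blend(self, reach_cmd: np.ndarray, grasp_cmd: np.ndarray) -> np.ndarray:
        return np.concatenate([reach_cmd, grasp_cmd])


def catch_step(est, quality, reach, grasp, gate, past_poses, hand_pose):
    """One control tick: predict the trajectory, pick the best pose, act."""
    predicted = est.predict_trajectory(past_poses)
    best = predicted[int(np.argmax(quality.rank(predicted)))]
    return gate.blend(reach.act(hand_pose, best), grasp.act(hand_pose, best))


past = np.array([[1.0, 0.0, 2.0], [0.9, 0.0, 1.9]])
action = catch_step(ObjectStateEstimator(), CatchingPoseQualityNet(),
                    ReachingPolicy(), GraspingPolicy(), GatingNetwork(),
                    past, hand_pose=np.zeros(3))
```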
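
For the kinematic-aware prompting entry above, a plausible reading is that the object's joint structure is serialized into the LLM prompt and the model is asked for low-level waypoints. The sketch below is a hypothetical illustration, not the paper's prompts: the JSON schema, field names, and prompt wording are all assumptions.

```python
import json


def kinematic_prompt(obj: dict, task: str) -> str:
    """Build an LLM prompt that embeds the object's kinematic structure."""
    return (
        "You control a robot gripper. The target articulated object has this "
        f"kinematic structure:\n{json.dumps(obj, indent=2)}\n"
        f"Task: {task}. Reply with a JSON list of 3D end-effector waypoints "
        "[[x, y, z], ...] that respects the joint type, axis, and range."
    )


# Hypothetical kinematic description of a door (a revolute joint plus a handle).
door = {
    "category": "door",
    "joints": [{"type": "revolute", "axis": [0, 0, 1],
                "origin": [0.0, 0.4, 0.0], "range_deg": [0, 90]}],
    "handle_position": [0.35, -0.38, 0.0],
}

prompt = kinematic_prompt(door, "open the door")
# Send `prompt` to any LLM and parse the returned waypoint list for execution.
```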